public inbox for goredo-devel@lists.stargrave.org Atom feed
* Memory exhaustion issues @ 2024-08-19 10:12 spacefrogg 2024-09-05 18:49 ` Sergey Matveev 0 siblings, 1 reply; 6+ messages in thread From: spacefrogg @ 2024-08-19 10:12 UTC (permalink / raw) To: goredo-devel Hi, In a fairly large project in the order of several 100k build targets I face memory exhaustion issues. The build closure has roughly the following shape: - ca. 50 immediate dependencies max. in a single redo-ifchange call at several levels of the dependency graph - max dependency chain length of 14 levels - 1 Python process per active target The build is run with a single active job `-j 1`. The following happens: For each target, the Python process eventually waits on a redo-ifchange call to finish its sub targets. One could expect that with a single job, only one process per dependency depth level (+1 for each redo-if*) would be created until reaching the end of the chain. But this is not what happens. Whenever a target calls redo-if*, redo activates a next-in-line target, which is not necessarily in the chain of the same process. Instead it is likely some other target from an earlier redo-if* call up the dependency chain. This leads to a large tree of processes, because the current chain is not finished. Instead it is waiting on a redo-if* call while another chain opens up – also eventually waiting on a redo-if* call and so forth. This creates hundreds of new processes effectively sleeping all the time (which is fine). Due to them being active processes, they consume a lot of memory until exhaustion (which is not fine). I believe the high fan out of 50 direct dependencies accommodates the growth of the active process tree, because it makes selecting the next target on a deeper level more unlikely. It is obvious that using Python immediately contributes to the severity of the issue but a) a change is out of the question for several reasons and b) it is just uncovering a more fundamental issue which would have hit the next guy on a project 3 times the size or so. There is no way, using the current redo-if* interface or any of its command-line options to control this development. My suggestion would be that, if in any way possible, redo should try to follow the dependency graph in depth-first-order to minimize the number of half-executed targets waiting on redo-if* calls. Thanks for reading! –Michael ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Memory exhaustion issues 2024-08-19 10:12 Memory exhaustion issues spacefrogg @ 2024-09-05 18:49 ` Sergey Matveev 2024-09-05 19:49 ` goredo 0 siblings, 1 reply; 6+ messages in thread From: Sergey Matveev @ 2024-09-05 18:49 UTC (permalink / raw) To: goredo-devel [-- Attachment #1: Type: text/plain, Size: 1084 bytes --] Greetings! *** spacefrogg [2024-08-19 12:12]: >My suggestion would be that, if in any way possible, redo should try to >follow the dependency graph in depth-first-order to minimize the number of >half-executed targets waiting on redo-if* calls. This is not possible, because, unlike Make, targets that are going to be built are not known in advance. We just can *assume* what will be built, according to our previous run, but we can not (have no right) to build them, if they are not explicitly requested by redo-* command. Each redo-* invocation literally tells redo to "here, build those targets too", and those targets also tells redo to build another ones. All of that are dynamically generated by invokers of redo-* commands. And we have to wait for their finishing, because as a rule we are inside some .do file, that will have other commands after redo* calls. Personally I see no way to change that behaviour somehow. That is the redo's nature. -- Sergey Matveev (http://www.stargrave.org/) OpenPGP: 12AD 3268 9C66 0D42 6967 FD75 CB82 0563 2107 AD8A [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Memory exhaustion issues 2024-09-05 18:49 ` Sergey Matveev @ 2024-09-05 19:49 ` goredo 2024-09-08 10:11 ` Sergey Matveev 0 siblings, 1 reply; 6+ messages in thread From: goredo @ 2024-09-05 19:49 UTC (permalink / raw) To: goredo-devel Hi, My apologies, this has gotten a long message. I see that I was not precise enough in my suggestion. I will try to sketch the issue and maybe we find a better resolve than the current implementation. Given, we run redo with a single job, just for the argument. We start building a target that depends directly on 10 targets, each of them also depends on 10 more targets, none of the targets repeat. This leaves us with a dependency tree of depth 3 and 111 targets total. Let us name the target on the first level target 1, on the next level targets 10-19 and the third level 100-199. Now, because we are building with one job, we enter target 1, hit redo-ifchange 10 11 12 ... 19, and block until the call returns. Right now we have three open processes, redo, target 1 and redo-ifchange. Right now redo is free to run anyone of targets 10-19. Let's say it picks target 10. Starts executing until it hits redo-ifchange 100 ... 109. Blocks again. Now, redo can pick any target from 11-19 and 100-109. We have two more processes open. Now, it is important at this point that we both can agree that any of the 19 targets can be selected. If you don't agree, chime in here. Let's continue and say that redo selects target 11, runs until redo-ifchange 110-119, picks target 12, ..., until it picks target 19 which blocks on targets 190-199. Each new 2-digit target opens two more processes that must stay open until enough of the 3-digit targets are finished. We have 23 processes open and blocked. Let's stop here and jump back to an alternative version. Red can pick from targets 11-19 and 100-109 again and selects target 100, executes it, returns and picks 101 and so on until 109 is finished. It must pick one of 11-19 now, takes 11. Now, we can repeat this argument for each of the two-digit targets. In all these runs, we had no more than six processes running, redo, target 1, redo-ifchange, one 2-digit target, redo-ifchange and one 3-digit target. In my first example, redo executed known and ready targets breadth first. The targets coming from the top-most redo-ifchange call (2-digit targets) were picked first. In my second example, redo executed targets depth-first. Of all known and ready targets, it prefered those of the most recent or the deepest redo-ifchange call. I hope I could show that I was not cheating and that the depth-first order was, in fact, saving on the number of processes. Why is that important to me? I have a project which needs to build 170k targets, distributed over a graph that is 13 layers deep. I observed that redo opens hundreds of targets from higher layers that all block on redo-ifchange. I hope I could get my point across. I am hopeful that redo could implement some sort of priority queue that favours targets of the last redo-ifchange call over more remote ones. Thanks for reading. :) –Michael ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Memory exhaustion issues 2024-09-05 19:49 ` goredo @ 2024-09-08 10:11 ` Sergey Matveev 2024-09-09 10:49 ` spacefrogg 0 siblings, 1 reply; 6+ messages in thread From: Sergey Matveev @ 2024-09-08 10:11 UTC (permalink / raw) To: goredo-devel [-- Attachment #1: Type: text/plain, Size: 1403 bytes --] Greetings! *** goredo [2024-09-05 21:49]: >Right now redo is free to run anyone of targets 10-19. Let's say it picks target 10. Starts executing until it hits redo-ifchange 100 ... 109. Blocks again. Now, redo can pick any target from 11-19 and 100-109. We have two more processes open. I see you point, ok. But how can we run targets in parallel? The very first process has redo-ifchange with 10 targets. It runs one of them. But it does not know when (or if at all) that running job will run any other targets further. So if we want to build two jobs in parallel, when we should run the second job? We do not know if already running one will run jobs further. If we will wait for its completion, then all spent time we were running only a single target, without even trying to run another job, wasted time. If goredo runs redo-ifchange with thousand of targets, then it runs runScript() function inside itself for each of those target. But it does not create additional process until it gets the job token. If we run with two jobs (-j 2) at most, and if already two jobs are running, then no additional process will be created, until any of those jobs finishes and returns job token back. Of course all of that will not work, if number of allowed jobs is infinite. -- Sergey Matveev (http://www.stargrave.org/) OpenPGP: 12AD 3268 9C66 0D42 6967 FD75 CB82 0563 2107 AD8A [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Memory exhaustion issues 2024-09-08 10:11 ` Sergey Matveev @ 2024-09-09 10:49 ` spacefrogg 2024-09-10 6:50 ` Sergey Matveev 0 siblings, 1 reply; 6+ messages in thread From: spacefrogg @ 2024-09-09 10:49 UTC (permalink / raw) To: goredo-devel Hi, > I see you point, ok. But how can we run targets in parallel? The very > first process has redo-ifchange with 10 targets. It runs one of them. I see no problem there. If redo has job tokens left, it spends them as I described earlier. Now, that I think more about it, I think it is a rather simple graph traversal problem on the dependency graph. In a graph like this: a -> b0 -> c00 -> d000 -> d001 -> c01 -> d010 -> d011 -> b1 -> c10 -> d100 -> d101 -> c11 -> d110 -> d111 Say, redo is started with 'a' and 2 job tokens. Then it picks 'b0' and 'b1', blocks on 'b1' first and selects 'c10'. Then it blocks on 'b0'. At this point, 'c00', 'c01' and 'c11' are available. I would like redo to now prefer tasks that are dependencies of 'b1', because it has already started 'c10'. Preferring everything coming from the 'b1' sub tree helps reducing the number of blocked processes. So, it starts 'c11'. Then it blocks on 'c10' and starts 'd100'. When it blocks 'c11', it, again, prefers 'd101' over 'd110' or 'd111' for the same reason. The path a->b1->c10->d100 is already active and finishing 'c10' should be preferred over opening anything that goes via 'c11' or 'b0' (which is still blocked on its 'c0*' targets). So, the general rules seem to be as follows. Each target is annotated with the depth at which is was discovered. Start at level 0, which are the explicit arguments to redo. With each redo-ifchange the levels of its argument list are updated with the current level + 1. For example, in the following graph: x -> y0 -> z00 -> z00 -> v0 Both 'x' and 'y0' depend on 'z00'. 'y0', 'v0' and 'z00' initially get level 1. Later 'z00' updated to level 2. This makes 'z00' more valuable than 'v0', because it appears deeper in the hierarchy and should preferred. (Because finishing 'z00' potentially closes a long chain of blocked processes.) The second important property is: Each node is ordered by order of discovery. You may still randomize the order of arguments to redo-ifchange, but this may not change the order of already discovered targets. Already known targets are just kept in their place and new targets are added to the end. This order is important to break ties if two targets have the same level. Coming back to the previous example with 2 job tokens. The order of discovered targets annotated with levels looks like this: a :0 b0 :1 b1 :1 c10 :2 # because it blocked on b1 first. So these are discovered first. c11 :2 c00 :2 # discovered later c01 :2 d000 :3 d001 :3 d110 :3 d111 :3 ... Deeper levels always win. So, after 'a' discovered 'b0' and 'b1' and started both, it discovered 'c10' and 'c11' at level 2. They are the new deepest level and 'c10' gets executed. Now, at the time 'b0' blocks and adds 'c00' and 'c01' to the list, 'c11' is still available and level 2 is still the highest priority. So, 'c11' is executed first, because it is higher on the list. By ordering targets on the same level this way, you are grouping targets that are on the same level and come from the same redo-ifchange call together. This ensures that, on the same level, you compute the dependencies of one redo-ifchange call, first, potentially finishing the task altogether and closing the process early. This also ensures that you compute deeper dependencies first, again, preferring to finish already opened processes before opening new ones on shallower levels. Again, you can order targets on the same level any way you want as long as the order does not change later and targets from the same redo-ifchange call are grouped together. This also means that if, say, 'c10' is discovered again on the same level, it is *not* moved in the list to group the targets together. > Of course all of that will not work, if number of allowed jobs is > infinite. True, if you allow infinite jobs, you are also not caring about memory, because you cannot know how much memory each process will take. By allow infinitely many processes, you already say that you think you have enough memory for all of them. All of this has been very specific, but I hope you get the idea. I am sure you have better implementations in mind that fulfill the same rules. Thanks for reading, –Michael ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Memory exhaustion issues 2024-09-09 10:49 ` spacefrogg @ 2024-09-10 6:50 ` Sergey Matveev 0 siblings, 0 replies; 6+ messages in thread From: Sergey Matveev @ 2024-09-10 6:50 UTC (permalink / raw) To: goredo-devel [-- Attachment #1: Type: text/plain, Size: 784 bytes --] Greetings! Ok, I agreed that if we can give job tokens taking the depth/priority in consideration, then your idea should work. Currently that is not possible easily, because there is single pipe, where job tokens is a single byte and who catches its first -- it will start the next job. Some kind of centralised arbiter will be required, that will know about each awaiting job (and its depth/priority) and will distribute job tokens not randomly. Seems to be not a hard task and seems that even compatibility to Make's jobserver should not be broken. Will try to look and implement that on a weekend, but not sure about free time. Thanks for suggestion! -- Sergey Matveev (http://www.stargrave.org/) OpenPGP: 12AD 3268 9C66 0D42 6967 FD75 CB82 0563 2107 AD8A [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 228 bytes --] ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-09-10 6:51 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2024-08-19 10:12 Memory exhaustion issues spacefrogg 2024-09-05 18:49 ` Sergey Matveev 2024-09-05 19:49 ` goredo 2024-09-08 10:11 ` Sergey Matveev 2024-09-09 10:49 ` spacefrogg 2024-09-10 6:50 ` Sergey Matveev