Skip to content

Conversation

@hossam26644
Copy link
Contributor

Here you go: @jeromekelleher
Still missing some tests, and I want to change the flag name of the python algorithm to match the msprime implementation.
As a reminder, this implementation consumes more memory than the when stopping at the youngest root. It could be the node table getting much larger, and could be a memory leak.

@codecov
Copy link

codecov bot commented Aug 5, 2025

Codecov Report

❌ Patch coverage is 91.17647% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.23%. Comparing base (1ab3389) to head (2ef10be).

Files with missing lines Patch % Lines
lib/msprime.c 90.00% 1 Missing and 2 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2396   +/-   ##
=======================================
  Coverage   91.23%   91.23%           
=======================================
  Files          20       20           
  Lines       11831    11847   +16     
  Branches     2296     2300    +4     
=======================================
+ Hits        10794    10809   +15     
  Misses        568      568           
- Partials      469      470    +1     
Flag Coverage Δ
C 91.23% <91.17%> (+<0.01%) ⬆️
python 98.68% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@jeromekelleher
Copy link
Member

Implementation looks good, it seems unlikely that it's a memory leak to me. Let's look at the size of the output tables I think. Probably this will be a function of your value of stop_time.

You need to run the C code through clang-format as a minimum to get CI passing.

@jeromekelleher
Copy link
Member

Note specific requirements for clang-format: https://tskit.dev/tskit/docs/stable/development.html#sec-development-c-requirements

@hossam26644 hossam26644 marked this pull request as ready for review August 8, 2025 14:38
@hossam26644
Copy link
Contributor Author

It is ready for review now.

Please check:

  • the tests I added and where I put them.
  • if we need extra tests to for the implementation (I can not think of another tests for the implementation itself)

Copy link
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Some minor comment above.

Also needed:

  1. Some C unit tests. We should be able to add a test or two to test_ancestry.c to cover the functionality. Nothing elaborate, just making sure the code is covered by valgrind.
  2. We also need to cover the other models. This will require updating msp_merge_ancestors in a similar way. Best to start with algorithms.py on this.

lib/msprime.c Outdated
for (avl_node = self->non_empty_populations.head; avl_node != NULL;
avl_node = avl_node->next) {
pop_id = (tsk_id_t)(intptr_t) avl_node->item;
pop_id = (tsk_id_t) (intptr_t) avl_node->item;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to be a bunch of formatting noise here - which version of clang-format are you using?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

version 18.. I will remove this noise

@jeromekelleher
Copy link
Member

I'm going to mute this thread @hossam26644 - please ping me when you'd like a review

@hossam26644
Copy link
Contributor Author

@jeromekelleher check now please, and please check the implementation of merge_ancestors. It works and passes the tests that I put.. but I am not sure if I am missing something

Copy link
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looking good. Needs more models though, in particular the pedigree model. You need to construct a pedigree in which full coalescence will happen and verify that you still go through the rest of the nodes (you'll need store_unary as well I guess, but that's true anyway?)


if stop_at_local_mrca is None:
stop_at_local_mrca = True
if not stop_at_local_mrca: # when set to False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment is redundant

CU_ASSERT_EQUAL(ret, 0);

ret = msp_run(&msp, t, ULONG_MAX);
CU_ASSERT_EQUAL(ret, 2); // exit on max time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the symbolic constant, no need for comment then

@hossam26644
Copy link
Contributor Author

hossam26644 commented Sep 7, 2025

@jeromekelleher
Ready for another review.

  • I disabled stop_at_local_mrca with fixed pedigree simulations since it make not sense (if I understand correctly)

One more thing:

I added a condition to stop simulations when there are no events to happen and stop_at_local_mrca is False, i.e., no coalescence, GC, recombination, or migration events; it used to give an error before, since simulations should stop one step before that. This error was raised for smc models only, while for Hudson's model there are always recombination then coalescence events (unless the seq length is 1 base, then it behaves like the smc models). These recombination and coalescence events happen even when the recombination rate is zero (for Hudson).

Now, if we can put a condition when simulations get to this infinite cycle of recombination and coalescence, we won't have to specify an end time; simulations can run until more time means nothing (no unary nodes no recombination no coalescence). I don't understand fully why we see this when recombination rate == zero, so maybe I am missing important details; but if think it will be useful, I can remove the requirement for an end time for simulations above the root.

@hossam26644 hossam26644 force-pushed the feat-simulate-after-local-mrca branch from 84de697 to f9af0ec Compare September 7, 2025 13:22
Copy link
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The infinite waiting time problem is in general hard (there's some references to it in the documentation if you look for it). It would be great if we could keep simulating until there are no events for the SMC, but we'll have to be careful not to introduce any unexpected behaviours otherwise.

Re the pedigree model - I don't see why it's not relevant? We can construct a pedigree in which coalescence happens very quickly, but we still want to track how the ancestral material passes through the ancestors. What happens at the moment when partial or full coalescence happens within the pedigree?

@hossam26644
Copy link
Contributor Author

@jeromekelleher we can go for another round of review:

  • Simulate above root is implemented with no end_time set. Please check the implementation especially for dtwf (I had to do some gymnastics to make it work)
  • Test cases for with and without end_time are implemented
  • Test case for the pedigree model built

Copy link
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not following some of the logic, but I think I just need to check it out an play with it myself to get my head around it. I think we should just clean this much up, squash the commits and merge. We can register a few follow-ups issues to close before release.

algorithms.py Outdated
mig_dest = k
min_time = min(t_re, t_ca, t_gcin, t_gc_left, t_mig)
assert min_time != infinity
assert (min_time != infinity) or (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comparing with the False singleton is an antipattern. Use not self.stop_at_local_mrca

algorithms.py Outdated
else:
self.t += min_time
if min_time == t_re:
if (min_time == infinity) and self.stop_at_local_mrca is False:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (min_time == infinity) and self.stop_at_local_mrca is False:
if (not self.stop_at_local_mrca) and (min_time == infinity):

Put the constant first so it gets short-circuited. (Not that it matters here, but just as a general principle)

"""


class PotentialInfiniteSimulationhWarning(UserWarning):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class PotentialInfiniteSimulationhWarning(UserWarning):
class PotentialInfiniteSimulationWarning(UserWarning):

lib/msprime.c Outdated
out:
msp_safe_free(parents);
msp_safe_free(segment_mem);
if (coalescence_occurred == false && recombination_occurred == false && ret == 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These out blocks are for error condition and cleanup handling only - move this block up to before the goto. Also you don't need to test for ret == 0 then.

Also, change to if ((!coalescence_occurred)) && (!recombination_occurred))

lib/msprime.c Outdated
goto out;
}
ret = msp_apply_demographic_events(self, self->next_demographic_event->time);
migration_occurred
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same point as before about comments on the same line

lib/msprime.c Outdated
self->time = cur_time;
ret = msp_dtwf_generation(self);
if (ret != 0) {
if (ret != 0 && ret != MSP_DTWF_DID_NOTHING) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to change the condition to if (ret < 0) .

lib/msprime.c Outdated
self->next_sampling_event++;
}

if (ret == MSP_DTWF_DID_NOTHING && migration_occurred == false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same style points as above

lib/msprime.c Outdated
msp_safe_free(node_trees);
msp_safe_free(n);
msp_safe_free(mig_tmp);
if (ret == MSP_DTWF_DID_NOTHING) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto on putting non cleanup logic in out block

@hossam26644 hossam26644 force-pushed the feat-simulate-after-local-mrca branch 2 times, most recently from f083429 to 2a250e9 Compare October 1, 2025 11:30
@hossam26644 hossam26644 force-pushed the feat-simulate-after-local-mrca branch from 2a250e9 to ec37d9f Compare October 4, 2025 06:37
@hossam26644
Copy link
Contributor Author

hossam26644 commented Dec 2, 2025

up @jeromekelleher

@jeromekelleher
Copy link
Member

@hossam26644 just FYI - I started looking into this in detail and it seemed easier to me to fix the stopping condition than to explain why we needed the max_time thing. I've fixed it for Hudson and am filling out the rest of the details now.

@jeromekelleher
Copy link
Member

Hmm - looks like there's a basic flaw in the logic for the multi-merger case, which wasn't really being tested. I'll need to spend more time on this to get to the bottom of it.

@hossam26644
Copy link
Contributor Author

I can invest some time into it. Actually, I would like to. But I am prioritising the paper draft now. If this is not urgent, maybe it can sit for a while until after the new year.

@jeromekelleher
Copy link
Member

Leave it with me - I'm going to see how far I can get with it today. There will be some follow-up work to do all right though, don't worry!

@jeromekelleher jeromekelleher force-pushed the feat-simulate-after-local-mrca branch from 239b8bb to be1b404 Compare December 3, 2025 15:17
@jeromekelleher jeromekelleher force-pushed the feat-simulate-after-local-mrca branch from be1b404 to 2ef10be Compare December 3, 2025 15:26
@jeromekelleher
Copy link
Member

OK, I think this is basically done. Can you review please @hossam26644? I'm going to log some follow-up issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants