speed up chimerge #1

Open
zsyf102900 opened this issue Oct 27, 2018 · 4 comments

Comments

@zsyf102900

I have 10 million intervals to merge.

Is there a need to add a loop over topk_smallest_chiinterval to speed this up?

@lisette-espin
Owner

Do you mean the merging part from lines 159 to 165?
If so, one can rewrite that block of code using masks to change the desired rows/columns of frequency_matrix. Then you won't need the for-loops.
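
The code at lines 159 to 165 is not shown in this thread, so the following is only a rough sketch of what a mask-based merge could look like. It assumes frequency_matrix is a NumPy array with one row per interval and one column per class, and that the pairs to merge do not overlap. The function name merge_rows_with_mask is hypothetical.

```python
import numpy as np


def merge_rows_with_mask(frequency_matrix, lower_idx):
    """Merge each row i in lower_idx into row i + 1's slot, without a Python loop.

    Assumption: the pairs do not overlap, i.e. lower_idx never contains
    both i and i + 1, and contains no duplicates.
    """
    freq = np.asarray(frequency_matrix).copy()
    lower_idx = np.asarray(lower_idx)
    upper_idx = lower_idx + 1

    # Add the counts of every upper row to its lower neighbour in one step.
    freq[lower_idx] += freq[upper_idx]

    # Boolean mask that drops the upper rows, which are now merged.
    keep = np.ones(len(freq), dtype=bool)
    keep[upper_idx] = False
    return freq[keep]


example = [[1, 0], [2, 1], [0, 3], [4, 4]]
merged = merge_rows_with_mask(example, [0, 2])
```

Here rows 0 and 1 are merged, and rows 2 and 3 are merged, leaving two intervals.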

@zsyf102900
Author

zsyf102900 commented Nov 14, 2018

Later, I found that it is not correct to add a for-loop around lines 159 to 165.
Each time we merge one pair of intervals, frequency_matrix changes, and consequently chi2_val changes as well, so we need to recalculate chi2_val after every merge.
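
To make the point above concrete, here is a minimal sketch of a merge loop that recomputes the chi-square values after every single merge. It is not the repository's code: chi2_adjacent and chimerge are hypothetical names, the frequency table is assumed to be a NumPy array (rows are intervals, columns are classes), cells with an expected count of zero are assumed to contribute nothing, and max_intervals is assumed to be at least 1.

```python
import numpy as np


def chi2_adjacent(freq):
    """Chi-square statistic for every pair of adjacent rows of freq."""
    a, b = freq[:-1], freq[1:]
    row_a = a.sum(axis=1, keepdims=True)
    row_b = b.sum(axis=1, keepdims=True)
    col = a + b                      # per-class totals of each pair
    total = row_a + row_b            # grand total of each pair
    with np.errstate(divide="ignore", invalid="ignore"):
        exp_a = row_a * col / total
        exp_b = row_b * col / total
        terms = (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    # Cells with zero expected count give 0/0 = nan; treat them as 0.
    return np.nansum(terms, axis=1)


def chimerge(freq, max_intervals):
    """Merge the lowest-chi-square adjacent pair until max_intervals remain."""
    freq = np.asarray(freq, dtype=float).copy()
    while len(freq) > max_intervals:
        chi2 = chi2_adjacent(freq)   # recomputed after every merge
        i = int(np.argmin(chi2))
        freq[i] += freq[i + 1]
        freq = np.delete(freq, i + 1, axis=0)
    return freq


result = chimerge([[10, 0], [9, 1], [0, 10], [1, 9]], max_intervals=2)
```

Recomputing every value on each pass is simple but costly for millions of intervals. Since a merge only changes the chi-square values of the pairs that touch the merged row, a faster variant could update just those entries; that optimization is not shown here.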

@zsyf102900
Author

zsyf102900 commented Nov 14, 2018

I have another question about lines 159-165, which handle interval merging.
We should not use the (lower, upper) index, because each time we merge intervals the upper index may be deleted.

  1. The next time you add to frequency_matrix, the upper index may no longer exist.
  2. The (lower, upper) index may no longer correspond to interval_values.

Solution:
We can reset the index so that interval_values is used as the index of the DataFrame.
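
One way to read this proposal is sketched below. It assumes the frequency table is a pandas DataFrame whose index holds the interval values themselves, so rows are addressed by label rather than by a position that shifts after each deletion. The function merge_by_label, the column names, and the sample values are all made up for illustration.

```python
import pandas as pd

# Hypothetical frequency table: the index holds the interval values,
# the columns hold per-class counts.
freq = pd.DataFrame(
    {"class_0": [10, 9, 0, 1], "class_1": [0, 1, 10, 9]},
    index=pd.Index([10, 25, 40, 70], name="interval_value"),
)


def merge_by_label(freq, lower_label):
    """Merge the interval labelled lower_label with the next interval."""
    pos = freq.index.get_loc(lower_label)   # current position of the label
    upper_label = freq.index[pos + 1]       # label of the following interval
    merged = freq.copy()
    merged.loc[lower_label] = merged.loc[lower_label] + merged.loc[upper_label]
    # Dropping by label keeps the remaining labels tied to their intervals.
    return merged.drop(index=upper_label)


step1 = merge_by_label(freq, 10)    # merges intervals 10 and 25
step2 = merge_by_label(step1, 40)   # label 40 is still valid after step 1
```

After the first merge the row for interval 40 has moved from position 2 to position 1, but its label is unchanged, so the second merge still finds the intended rows.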

Looking forward to your reply.

@zsyf102900
Author

Sorry, I could not upload my code because of the information security regulations at Huawei.


2 participants