Permute Compute and Revert
This is the first of a (hopefully long) series of very short posts whose purpose is to communicate a simple trick I often find useful.
Tip: The argsort of a permutation is the inverse of that permutation¶
import numpy as np
some_data = np.arange(10)
random_permutation = np.random.permutation(len(some_data))
inverse_permutation = np.argsort(random_permutation)
some_data
permuted_data = some_data[random_permutation]
permuted_data
reverted = permuted_data[inverse_permutation]
reverted
Isn't that obvious?¶
You may think that this little nugget of information is completely obvious. You may be right. After all argsort returns the indexes into the array that will put it in order, which means that the argsort of the permutation gives the indexes into the permutation array which would undo that permutation, which is the inverse permutation by definition.
But I think this is only obvious in retrospect. A google search for "calculating the inverse of a permutation" yields things which are more likely to be useful to you if you want to know about the theory of permutation groups. Things like this and this. Even if you add in "in Python" to the end of that search you end up with things like this where the top answer is a python for loop which will be incredibly slow compared to just using argsort. To be fair lower down in the answers to that same question people recommend using argsort as a faster alternative and if you append "in numpy" instead of "in python" to the same search the top result uses argsort.
Some years back when I was trying to figure out how to calculate the inverse of a permutation I ran into only the first sort of search results which gave me the impression that coding up something to calculate an inverse permutation would be somewhat complicated and I just avoided it. But eventually it occured to me that argsort, a function I use all the time, was actually exactly what I had wanted all along. So for all you people out there that have gotten stuck in the theory of permutations (which by the way can be absolutely fascinating), you don't need to know anything about permutation theory or cycle notation or any of that, just use argsort.
Permute Compute Revert¶
If you didn't happen to land on this post via a google search for "how to calculate inverse permutations" then you might be thinking that this information has only very niche applications. But if you do data analysis in python you might be surprised just how useful this trick can be. I frequently will want to know a piece of information which is dependent on a rank ordering of various aspects of my data and more often than not I find that first permuting the ordering of the relevant information doing a (usually now simple) calculation and then applying the inverse permutation to the results can sometimes save me hundreds of lines of complicated buggy multiply indexed code with lots of python for loops and replace it with a couple of lines of simple numpy vectorized operations.
Although I may not use this paradigm as much as the "split apply combine" style of data processing embodied in the pandas group by operation. This "permute compute revert" style of transformation is just about as useful for dealing with relative rank order information as groupby is useful for dealing with group derived information.
Comments
Comments powered by Disqus