function arglexsort
Returns the indices that would lexicographically sort multiple arrays, treating them as columns of a structured array.
/tf/active/vicechatdev/patches/util.py
1955 - 1964
moderate
Purpose
This function performs lexicographical (dictionary-style) sorting on multiple arrays simultaneously, returning the indices that would sort the arrays. It's useful when you need to sort data by multiple keys in priority order, similar to SQL's ORDER BY with multiple columns. The first array has the highest sorting priority, followed by subsequent arrays as tiebreakers.
Source Code
def arglexsort(arrays):
    """
    Returns the indices of the lexicographical sorting
    order of the supplied arrays.
    """
    dtypes = ','.join(array.dtype.str for array in arrays)
    recarray = np.empty(len(arrays[0]), dtype=dtypes)
    for i, array in enumerate(arrays):
        recarray['f%s' % i] = array
    return recarray.argsort()
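The mechanism in the function body can be sketched step by step. This is a minimal illustration, not part of the original module: the variable names `primary`, `secondary`, and `records` are made up for the example, and the field names `f0` and `f1` are the ones NumPy assigns automatically to a comma-separated dtype string.

```python
import numpy as np

# Two key columns: the first is the primary key, the second breaks ties.
primary = np.array([2, 1, 2, 1])
secondary = np.array([0.5, 9.0, 0.1, 3.0])

# Build one structured array whose fields are auto-named f0, f1.
dtypes = ','.join(a.dtype.str for a in (primary, secondary))
records = np.empty(len(primary), dtype=dtypes)
records['f0'] = primary
records['f1'] = secondary

# Sorting a structured array compares f0 first, then f1 where f0 ties.
order = records.argsort()
print(order)  # [3 1 2 0]
```

Rows 1 and 3 share the smallest primary key, so the secondary key decides that row 3 (3.0) comes before row 1 (9.0).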
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| arrays | - | - | positional_or_keyword |
Parameter Details
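arrays: A sequence (list or tuple) of one-dimensional NumPy arrays, all of the same length. Each element must already be an ndarray, because the function reads its dtype attribute; plain Python lists need to be converted with np.asarray first. The arrays act as sort keys in priority order: the first array is the primary key, the second breaks ties in the first, and so on. Numeric and fixed-width string dtypes both work, as the usage example shows. Pass at least two arrays: with a single array the joined dtype string describes a plain dtype rather than a structured one, so the field assignment fails.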
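Because the function reads `array.dtype` directly, callers with plain Python lists may want a thin guard in front of it. The sketch below is one possible approach, not part of the original module: `arglexsort_checked` is a hypothetical name, and the specific checks (at least two keys, 1-D, equal length) are assumptions about what a caller would want validated.

```python
import numpy as np

def arglexsort(arrays):
    # Same logic as the documented function, repeated so this sketch runs alone.
    dtypes = ','.join(array.dtype.str for array in arrays)
    recarray = np.empty(len(arrays[0]), dtype=dtypes)
    for i, array in enumerate(arrays):
        recarray['f%s' % i] = array
    return recarray.argsort()

def arglexsort_checked(keys):
    """Hypothetical wrapper: coerce array-likes and validate them first."""
    arrays = [np.asarray(key) for key in keys]
    if len(arrays) < 2:
        raise ValueError("need at least two key arrays")
    length = len(arrays[0]) if arrays[0].ndim == 1 else -1
    for array in arrays:
        # Check ndim before len so 0-d inputs are rejected cleanly.
        if array.ndim != 1 or len(array) != length:
            raise ValueError("all keys must be 1-D and the same length")
    return arglexsort(arrays)

# Plain lists are accepted; strings are the primary key, ints break ties.
order = arglexsort_checked([["b", "a", "b"], [2, 5, 1]])
print(order)  # [1 2 0]
```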
Return Value
Returns a numpy array of integer indices that would sort the input arrays in lexicographical order. The returned array has the same length as the input arrays. These indices can be used to reorder the original arrays (e.g., arrays[0][result] would give the sorted version of the first array).
Dependencies
numpy
Required Imports
import numpy as np
Usage Example
import numpy as np
def arglexsort(arrays):
    dtypes = ','.join(array.dtype.str for array in arrays)
    recarray = np.empty(len(arrays[0]), dtype=dtypes)
    for i, array in enumerate(arrays):
        recarray['f%s' % i] = array
    return recarray.argsort()
# Example: Sort by first array, then by second array for ties
first_names = np.array(['John', 'Jane', 'John', 'Alice'])
ages = np.array([30, 25, 25, 30])
# Get sorting indices
indices = arglexsort([first_names, ages])
print(indices) # Output: [3 1 2 0]
# Apply indices to sort the arrays
sorted_names = first_names[indices]
sorted_ages = ages[indices]
print(sorted_names) # ['Alice' 'Jane' 'John' 'John']
print(sorted_ages) # [30 25 25 30]
Best Practices
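- All input arrays should have the same length as the first one. A mismatched length raises a ValueError during field assignment, except that a length-1 array is silently broadcast to every row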
- The function creates a structured numpy array internally, which may use significant memory for large datasets
- Keys are applied in the order they appear in the input list: the first array has the highest priority and later arrays only break ties
- The function works with any numpy-compatible dtype that supports comparison operations
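- For very large datasets, consider using numpy.lexsort() directly, as it may be more memory efficient. Note that numpy.lexsort() treats the last key as the primary one, so the key list must be reversed to get the same ordering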
- The returned indices can be used with fancy indexing to reorder any array of the same length
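The relationship to `numpy.lexsort()` mentioned in the list above can be checked with a small comparison. This is a sketch on made-up data (`groups` and `scores` are illustrative names); it assumes no two rows are identical across all keys, so the two approaches cannot differ in how they order exact duplicates.

```python
import numpy as np

def arglexsort(arrays):
    # Same logic as the documented function, repeated so this sketch runs alone.
    dtypes = ','.join(array.dtype.str for array in arrays)
    recarray = np.empty(len(arrays[0]), dtype=dtypes)
    for i, array in enumerate(arrays):
        recarray['f%s' % i] = array
    return recarray.argsort()

groups = np.array([2, 1, 2, 1])
scores = np.array([7, 3, 5, 9])
keys = [groups, scores]  # highest priority first, as arglexsort expects

order_arglexsort = arglexsort(keys)

# np.lexsort uses the LAST key as the primary one, so reverse the list.
order_lexsort = np.lexsort(keys[::-1])

print(order_arglexsort)  # [1 3 2 0]
print(order_lexsort)     # [1 3 2 0]
```

Either index array can then be applied with fancy indexing, for example `scores[order_lexsort]`.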
Similar Components
AI-powered semantic similarity - components with related functionality:
- function search_indices 60.5% similar
- function dimension_sort 50.2% similar
- function python2sort 47.4% similar
- function sort_topologically 41.8% similar
- function cross_index 41.8% similar