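I want to merge several huge sorted enumerable sequences. These lists are handled as IEnumerable but are already sorted. Since the input lists are sorted, it should be possible to merge them in a single pass, without re-sorting anything.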
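I would like to keep the deferred execution behavior.
I tried to write a naive algorithm that does this (see below). However, it looks pretty ugly and I'm sure it can be optimized. A more academic algorithm may well exist...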
IEnumerable<T> MergeOrderedLists<T, TOrder>(IEnumerable<IEnumerable<T>> orderedlists, Func<T, TOrder> orderBy)
{
    var enumerators = orderedlists.ToDictionary(l => l.GetEnumerator(), l => default(T));
    IEnumerator<T> tag = null;
    var firstRun = true;

    while (true)
    {
        var toRemove = new List<IEnumerator<T>>();
        var toAdd = new List<KeyValuePair<IEnumerator<T>, T>>();

        foreach (var pair in enumerators.Where(pair => firstRun || tag == pair.Key))
        {
            if (pair.Key.MoveNext())
                toAdd.Add(pair);
            else
                toRemove.Add(pair.Key);
        }

        foreach (var enumerator in toRemove)
            enumerators.Remove(enumerator);

        foreach (var pair in toAdd)
            enumerators[pair.Key] = pair.Key.Current;

        if (enumerators.Count == 0)
            yield break;

        var min = enumerators.OrderBy(t => orderBy(t.Value)).FirstOrDefault();
        tag = min.Key;
        yield return min.Value;

        firstRun = false;
    }
}
The method can be used like this:
// Person lists are already sorted by age
MergeOrderedLists(orderedList, p => p.Age);
Assume the following Person class exists somewhere:
public class Person
{
    public int Age { get; set; }
}
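Duplicates should be preserved; we don't care about their relative order in the new sequence. Do you see any obvious optimizations I could make?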
Here is my fourth cut at it (thanks to @tanascius for pushing it toward something more LINQ-like):
public static IEnumerable<T> MergePreserveOrder3<T, TOrder>(
    this IEnumerable<IEnumerable<T>> aa,
    Func<T, TOrder> orderFunc)
    where TOrder : IComparable<TOrder>
{
    var items = aa.Select(xx => xx.GetEnumerator())
                  .Where(ee => ee.MoveNext())
                  .OrderBy(ee => orderFunc(ee.Current))
                  .ToList();

    while (items.Count > 0)
    {
        yield return items[0].Current;

        var next = items[0];
        items.RemoveAt(0);
        if (next.MoveNext())
        {
            // simple sorted linear insert
            var value = orderFunc(next.Current);
            var ii = 0;
            for ( ; ii < items.Count; ++ii)
            {
                if (value.CompareTo(orderFunc(items[ii].Current)) <= 0)
                {
                    items.Insert(ii, next);
                    break;
                }
            }

            if (ii == items.Count) items.Add(next);
        }
        else next.Dispose(); // woops! can't forget IDisposable
    }
}
Results:
for (int p = 0; p < people.Count; ++p)
{
    Console.WriteLine("List {0}:", p + 1);
    Console.WriteLine("\t{0}", String.Join(", ", people[p].Select(x => x.Name)));
}

Console.WriteLine("Merged:");
foreach (var person in people.MergePreserveOrder(pp => pp.Age))
{
    Console.WriteLine("\t{0}", person.Name);
}

List 1:
        8yo, 22yo, 47yo, 49yo
List 2:
        35yo, 47yo, 60yo
List 3:
        28yo, 55yo, 64yo
Merged:
        8yo
        22yo
        28yo
        35yo
        47yo
        47yo
        49yo
        55yo
        60yo
        64yo
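The test snippet above uses a `people` variable and a `Name` property that are not defined anywhere in the post. As a rough sketch only, input data consistent with the printed output could be built as follows. The `Name` property, the `SampleData` class, and the `Make` helper are all assumptions introduced for illustration; they are not part of the original code.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public int Age { get; set; }

    // Assumption: the question's Person class has no Name property;
    // it is added here because the test output prints names such as "8yo".
    public string Name { get; set; }
}

public static class SampleData
{
    // Hypothetical helper: builds one list of people from the given ages.
    // The ages are expected to be passed in ascending order.
    private static List<Person> Make(params int[] ages) =>
        ages.Select(a => new Person { Age = a, Name = a + "yo" }).ToList();

    // Three lists, each already sorted by age, matching the printed output.
    public static List<List<Person>> Build() => new List<List<Person>>
    {
        Make(8, 22, 47, 49),
        Make(35, 47, 60),
        Make(28, 55, 64),
    };
}
```

With this in place, `var people = SampleData.Build();` would give the loop something to iterate over.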
Improved with .NET 4.0's Tuple support:
public static IEnumerable<T> MergePreserveOrder4<T, TOrder>(
    this IEnumerable<IEnumerable<T>> aa,
    Func<T, TOrder> orderFunc)
    where TOrder : IComparable<TOrder>
{
    var items = aa.Select(xx => xx.GetEnumerator())
                  .Where(ee => ee.MoveNext())
                  .Select(ee => Tuple.Create(orderFunc(ee.Current), ee))
                  .OrderBy(ee => ee.Item1)
                  .ToList();

    while (items.Count > 0)
    {
        yield return items[0].Item2.Current;

        var next = items[0];
        items.RemoveAt(0);
        if (next.Item2.MoveNext())
        {
            var value = orderFunc(next.Item2.Current);
            var ii = 0;
            for ( ; ii < items.Count; ++ii)
            {
                if (value.CompareTo(items[ii].Item1) <= 0)
                {
                    // NB: using a tuple to minimize calls to orderFunc
                    items.Insert(ii, Tuple.Create(value, next.Item2));
                    break;
                }
            }

            if (ii == items.Count) items.Add(Tuple.Create(value, next.Item2));
        }
        else next.Item2.Dispose(); // woops! can't forget IDisposable
    }
}
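Both versions above re-insert the advanced enumerator with a linear scan, which costs O(k) per yielded item for k input sequences. If a newer runtime is available, one possible variation is to keep the enumerators in a binary heap so that each step costs O(log k). The sketch below assumes .NET 6 or later, where `PriorityQueue<TElement, TPriority>` exists; the method name `MergePreserveOrderHeap` is made up for this example. It is a different technique from the sorted list used above, not a drop-in optimization of it, and the heap gives no ordering guarantee between equal keys (which is acceptable here, since the order of duplicates does not matter).

```csharp
using System;
using System.Collections.Generic;

public static class MergeExtensions
{
    // Hypothetical name. Assumes .NET 6+ for PriorityQueue<TElement, TPriority>.
    public static IEnumerable<T> MergePreserveOrderHeap<T, TOrder>(
        this IEnumerable<IEnumerable<T>> sources,
        Func<T, TOrder> orderFunc)
        where TOrder : IComparable<TOrder>
    {
        // Each heap entry is an enumerator keyed by the sort key of its Current item.
        var heap = new PriorityQueue<IEnumerator<T>, TOrder>();
        try
        {
            foreach (var source in sources)
            {
                var e = source.GetEnumerator();
                if (e.MoveNext()) heap.Enqueue(e, orderFunc(e.Current));
                else e.Dispose(); // empty sequence: nothing to merge
            }

            while (heap.Count > 0)
            {
                // Peek first so the enumerator is still in the heap
                // (and gets disposed below) if the caller stops early.
                var next = heap.Peek();
                yield return next.Current;

                heap.Dequeue();
                if (next.MoveNext()) heap.Enqueue(next, orderFunc(next.Current));
                else next.Dispose();
            }
        }
        finally
        {
            // Dispose whatever enumerators are still in the heap,
            // e.g. when the caller breaks out of a foreach.
            while (heap.Count > 0) heap.Dequeue().Dispose();
        }
    }
}
```

Execution stays deferred because the method is still an iterator: no source is touched until the result is enumerated. For a handful of input lists the linear insert is likely just as fast in practice; the heap mainly pays off when the number of sequences is large.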