我有一个清单清单,如我所附的代码所示。如果有任何公共值,我想链接每个子列表。然后,我想用压缩的列表列表替换列表列表。 示例: 如果我有一个列表,[[1,2,3],[3,4]]我想要[1,2,3,4]。如果我有[[4,3],[1,2,3]]我想要的[4,3,1,2]。如果我有[[1,2,3],[a,b],[3,4],[b,c]]我想要的,[[1,2,3,4],[a,b,c]]或者[[a,b,c],[1,2,3,4]]我不在乎哪一个。
[[1,2,3],[3,4]]
[1,2,3,4]
[[4,3],[1,2,3]]
[4,3,1,2]
[[1,2,3],[a,b],[3,4],[b,c]]
[[1,2,3,4],[a,b,c]]
[[a,b,c],[1,2,3,4]]
我快到了…
我的问题是,当我有一个[[1,2,3],[10,5],[3,8,5]]想要的案例,[1,2,3,10,5,8]但使用当前代码时,[1,2,3,8,10,5]
[[1,2,3],[10,5],[3,8,5]]
[1,2,3,10,5,8]
[1,2,3,8,10,5]
这是我的代码:
import itertools a = [1,2,3] b = [3,4] i = [21,22] c = [88,7,8] e = [5,4] d = [3, 50] f = [8,9] g= [9,10] h = [20,21] lst = [a,b,c,i,e,d,f,g,h,a,c,i]*1000 #I have a lot of list but not very many different lists def any_overlap(a, b): sb = set(b) return any(itertools.imap(sb.__contains__, a)) def find_uniq(lst): ''' return the uniq parts of lst''' seen = set() seen_add = seen.add return [ x for x in lst if x not in seen and not seen_add(x)] def overlap_inlist(o_lst, lstoflst): ''' Search for overlap, using "any_overlap", of a list( o_lst) in a list of lists (lstoflst). If there is overlap add the uniq part of the found list to the search list, and keep track of where that list was found ''' used_lst =[ ] n_lst =[ ] for lst_num, each_lst in enumerate(lstoflst): if any_overlap(o_lst,each_lst): n_lst.extend(each_lst) used_lst.append(lst_num) n_lst= find_uniq(n_lst) return n_lst, used_lst def comb_list(lst): ''' For each list in a list of list find all the overlaps using 'ovelap_inlist'. Update the list each time to delete the found lists. Return the final combined list. ''' for updated_lst in lst: n_lst, used_lst = overlap_inlist(updated_lst,lst) lst[:] = [x for i,x in enumerate(lst) if i not in used_lst] lst.insert(0,n_lst) return lst comb_lst = comb_list(lst) print comb_lst
该脚本的输出是:
[[88, 7, 8, 9, 10], [1, 2, 3, 4, 50, 5], [21, 22, 20]]
我想要它,所以密钥按原始顺序排列:
[[88, 7, 8, 9, 10], [1, 2, 3, 4, 5, 50,], [21, 22, 20]]
在 5和50切换 在新LST [2]
我对python有点陌生。对于该问题的任何解决方案或对当前代码的评论,我将不胜感激。我不是计算机科学家,我想可能会有某种算法能够快速地做到这一点(也许是从集合论出发?)。如果有这样的算法,请指出正确的方向。
我可能会让这种方式变得更复杂,然后呢……谢谢!!
这是一种蛮力方法(可能更容易理解):
from itertools import chain def condense(*lists): # remember original positions positions = {} for pos, item in enumerate(chain(*lists)): if item not in positions: positions[item] = pos # condense disregarding order sets = condense_sets(map(set, lists)) # restore order result = [sorted(s, key=positions.get) for s in sets] return result if len(result) != 1 else result[0] def condense_sets(sets): result = [] for candidate in sets: for current in result: if candidate & current: # found overlap current |= candidate # combine (merge sets) # new items from candidate may create an overlap # between current set and the remaining result sets result = condense_sets(result) # merge such sets break else: # no common elements found (or result is empty) result.append(candidate) return result
>>> condense([1,2,3], [10,5], [3,8,5]) [1, 2, 3, 10, 5, 8] >>> a = [1,2,3] >>> b = [3,4] >>> i = [21,22] >>> c = [88,7,8] >>> e = [5,4] >>> d = [3, 50] >>> f = [8,9] >>> g= [9,10] >>> h = [20,21] >>> condense(*[a,b,c,i,e,d,f,g,h,a,c,i]*1000) [[1, 2, 3, 4, 5, 50], [88, 7, 8, 9, 10], [21, 22, 20]] >>> condense([1], [2, 3, 2]) [[1], [2, 3]]
condense_*()函数来自于该问题的答案。lst_OP来自问题的输入列表(不同大小),lst_BK-来自@Blckknght答案的测试列表(不同大小)。请参阅源代码。
condense_*()
lst_OP
lst_BK
测量表明,基于“不相交集”和“无向图的连接组件”概念的解决方案在两种输入类型上的执行效果均相似。
name time ratio comment condense_hynekcer 5.79 msec 1.00 lst_OP condense_hynekcer2 7.4 msec 1.28 lst_OP condense_pillmuncher2 11.5 msec 1.99 lst_OP condense_blckknght 16.8 msec 2.91 lst_OP condense_jfs 26 msec 4.49 lst_OP condense_BK 30.5 msec 5.28 lst_OP condense_blckknght2 30.9 msec 5.34 lst_OP condense_blckknght3 32.5 msec 5.62 lst_OP name time ratio comment condense_blckknght 964 usec 1.00 lst_BK condense_blckknght2 1.41 msec 1.47 lst_BK condense_blckknght3 1.46 msec 1.51 lst_BK condense_hynekcer2 1.95 msec 2.03 lst_BK condense_pillmuncher2 2.1 msec 2.18 lst_BK condense_hynekcer 2.12 msec 2.20 lst_BK condense_BK 3.39 msec 3.52 lst_BK condense_jfs 544 msec 563.66 lst_BK name time ratio comment condense_hynekcer 6.86 msec 1.00 lst_rnd condense_jfs 16.8 msec 2.44 lst_rnd condense_blckknght 28.6 msec 4.16 lst_rnd condense_blckknght2 56.1 msec 8.18 lst_rnd condense_blckknght3 56.3 msec 8.21 lst_rnd condense_BK 70.2 msec 10.23 lst_rnd condense_pillmuncher2 324 msec 47.22 lst_rnd condense_hynekcer2 334 msec 48.61 lst_rnd
要重现结果,请克隆要点并运行measure- performance-condense- lists.py
measure- performance-condense- lists.py