假设您需要计算矩阵上的孤岛数量
{1, 1, 0, 0, 0}, {0, 1, 0, 0, 1}, {1, 0, 0, 1, 1}, {0, 0, 0, 0, 0}, {1, 0, 1, 0, 1}
当输入矩阵大小适合内存时,我们可以简单地使用DFS或BFS。
但是,如果输入矩阵很大而无法放入内存,该怎么办?
我可以将输入矩阵分块/拆分为不同的小文件,然后分别读取它们。
但是如何合并它们呢?
我陷入了如何合并它们的困境。我的想法是,合并它们时,我们必须阅读一些重叠的部分。但是,这样做的具体方法是什么?
当我在白板上绘制以下示例并逐行处理它时。合并左,然后合并顶部,这似乎行不通。
来自Matt的解决方案。
int topidx = col * 2; int botidx = topidx + 1;
使用联合查找,基本算法(无需担心内存)是:
1
简单,一点点地注意,就可以使用对矩阵的顺序访问和仅2行的内存来做到这一点:
对于每个其他行:
当然,关键是只要您在不同行中的链接集始终将链接指向下方。这不会损害算法的复杂性,而且,如果您使用自己的联合查找,则很容易实现。如果您使用的是库数据结构,则可以仅对每一行使用它,并自己跟踪不同行中的根集之间的链接。
因为这实际上是我最喜欢的算法之一,所以这里是Java的实现。这不是最易读的实现,因为它涉及一些低级技巧,但是效率极高且简短- 我会在性能非常重要的情况下写这种东西:
import java.util.Arrays; public class Islands { private static final String[] matrix=new String[] { " ############# ### ", " # ##### ## ", " # ## ## # # ", " ### ## # # ", " # ######### ## ## ", " ## ## ", " ########## ", }; // find with path compression. // If sets[s] < 0 then it is a link to ~sets[s]. Otherwise it is size of set static int find(int[] sets, int s) { int parent = ~sets[s]; if (parent>=0) { int root = find(sets, parent); if (root != parent) { sets[s] = ~root; } return root; } return s; } // union-by-size // If sets[s] < 0 then it is a link to ~sets[s]. Otherwise it is size of set static boolean union(int[] sets, int x, int y) { x = find(sets,x); y = find(sets,y); if (x!=y) { if ((sets[x] < sets[y])) { sets[y] += sets[x]; sets[x] = ~y; } else { sets[x] += sets[y]; sets[y] = ~x; } return true; } return false; } // Count islands in matrix public static void main(String[] args) { // two rows of union-find sets. // top row is at even indexes, bottom row is at odd indexes. This arrangemnt is chosen just // to make resizing this array easier. // For each value x: // x==0 => no set. x>0 => root set of size x. x<0 => link to ~x int cols=4; int[] setrows= new int[cols*2]; int islandCount = 0; for (String s : matrix) { System.out.println(s); //Make sure our rows are big enough if (s.length() > cols) { cols=s.length(); if (setrows.length < cols*2) { int newlen = Math.max(cols,setrows.length)*2; setrows = Arrays.copyOf(setrows, newlen); } } //Create sets for land in bottom row, merging left for (int col=0; col<s.length(); ++col) { if (!Character.isWhitespace(s.charAt(col))) { int idx = col*2+1; setrows[idx]=1; //set of size 1 if (idx>=2 && setrows[idx-2]!=0) { union(setrows, idx, idx-2); } } } //merge up for (int col=0; col<cols; ++col) { int topidx = col*2; int botidx = topidx+1; if (setrows[topidx]!=0 && setrows[botidx]!=0) { int toproot=find(setrows,topidx); if ((toproot&1)!=0) { //top set is already linked down union(setrows, toproot, botidx); } else { //link top root down. It does not matter that we aren't counting its size, since //we will shortly throw it aaway setrows[toproot] = ~botidx; } } } //count root sets, discard top row, and move bottom row up while fixing links for (int col=0; col<cols; ++col) { int topidx = col * 2; int botidx = topidx + 1; if (setrows[topidx]>0) { ++islandCount; } int v = setrows[botidx]; setrows[topidx] = (v>=0 ? v : v|1); //fix up link if necessary setrows[botidx] = 0; } } //count remaining root sets in top row for (int col=0; col<cols; ++col) { if (setrows[col*2]>0) { ++islandCount; } } System.out.println("\nThere are "+islandCount+" islands there"); } }