我有三个数组,需要组合成一个三维数组。以下代码显示了Performance Explorer中的性能下降。有更快的解决方案吗?
for (int i = 0; i < sortedIndex.Length; i++) { if (i < num_in_left) { // add instance to the left child leftnode[i, 0] = sortedIndex[i]; leftnode[i, 1] = sortedInstances[i]; leftnode[i, 2] = sortedLabels[i]; } else { // add instance to the right child rightnode[i-num_in_left, 0] = sortedIndex[i]; rightnode[i-num_in_left, 1] = sortedInstances[i]; rightnode[i-num_in_left, 2] = sortedLabels[i]; } }
更新:
我实际上正在尝试执行以下操作:
//given three 1d arrays double[] sortedIndex, sortedInstances, sortedLabels; // copy them over to a 3d array (forget about the rightnode for now) double[] leftnode = new double[sortedIndex.Length, 3]; // some magic happens here so that leftnode = {sortedIndex, sortedInstances, sortedLabels};
使用Buffer.BlockCopy。它的全部目的是快速执行(请参见Buffer):
与System.Array类中的类似方法相比,此类提供了更好的操作原始类型的性能。
诚然,我还没有做过任何基准测试,但这就是文档。它也适用于多维数组。只需确保始终指定要复制的 字节数 ,而不是指定多少 个 元素,并且确保您正在处理原始数组。
另外,我还没有对此进行测试,但是如果将委托绑定到并直接调用它,则 可能会使 系统的性能降低System.Buffer.memcpyimpl一点。签名是:
System.Buffer.memcpyimpl
internal static unsafe void memcpyimpl(byte* src, byte* dest, int len)
它确实需要指针,但是我相信它已针对可能的最高速度进行了优化,因此,即使您手头有组装,我也认为没有任何方法可以使速度更快。
更新 :
由于要求(并满足我的好奇心),我对此进行了测试:
using System; using System.Diagnostics; using System.Reflection; unsafe delegate void MemCpyImpl(byte* src, byte* dest, int len); static class Temp { //There really should be a generic CreateDelegate<T>() method... -___- static MemCpyImpl memcpyimpl = (MemCpyImpl)Delegate.CreateDelegate( typeof(MemCpyImpl), typeof(Buffer).GetMethod("memcpyimpl", BindingFlags.Static | BindingFlags.NonPublic)); const int COUNT = 32, SIZE = 32 << 20; //Use different buffers to help avoid CPU cache effects static byte[] aSource = new byte[SIZE], aTarget = new byte[SIZE], bSource = new byte[SIZE], bTarget = new byte[SIZE], cSource = new byte[SIZE], cTarget = new byte[SIZE]; static unsafe void TestUnsafe() { Stopwatch sw = Stopwatch.StartNew(); fixed (byte* pSrc = aSource) fixed (byte* pDest = aTarget) for (int i = 0; i < COUNT; i++) memcpyimpl(pSrc, pDest, SIZE); sw.Stop(); Console.WriteLine("Buffer.memcpyimpl: {0:N0} ticks", sw.ElapsedTicks); } static void TestBlockCopy() { Stopwatch sw = Stopwatch.StartNew(); sw.Start(); for (int i = 0; i < COUNT; i++) Buffer.BlockCopy(bSource, 0, bTarget, 0, SIZE); sw.Stop(); Console.WriteLine("Buffer.BlockCopy: {0:N0} ticks", sw.ElapsedTicks); } static void TestArrayCopy() { Stopwatch sw = Stopwatch.StartNew(); sw.Start(); for (int i = 0; i < COUNT; i++) Array.Copy(cSource, 0, cTarget, 0, SIZE); sw.Stop(); Console.WriteLine("Array.Copy: {0:N0} ticks", sw.ElapsedTicks); } static void Main(string[] args) { for (int i = 0; i < 10; i++) { TestArrayCopy(); TestBlockCopy(); TestUnsafe(); Console.WriteLine(); } } }
结果:
Buffer.BlockCopy: 469,151 ticks Array.Copy: 469,972 ticks Buffer.memcpyimpl: 496,541 ticks Buffer.BlockCopy: 421,011 ticks Array.Copy: 430,694 ticks Buffer.memcpyimpl: 410,933 ticks Buffer.BlockCopy: 425,112 ticks Array.Copy: 420,839 ticks Buffer.memcpyimpl: 411,520 ticks Buffer.BlockCopy: 424,329 ticks Array.Copy: 420,288 ticks Buffer.memcpyimpl: 405,598 ticks Buffer.BlockCopy: 422,410 ticks Array.Copy: 427,826 ticks Buffer.memcpyimpl: 414,394 ticks
现在更改顺序:
Array.Copy: 419,750 ticks Buffer.memcpyimpl: 408,919 ticks Buffer.BlockCopy: 419,774 ticks Array.Copy: 430,529 ticks Buffer.memcpyimpl: 412,148 ticks Buffer.BlockCopy: 424,900 ticks Array.Copy: 424,706 ticks Buffer.memcpyimpl: 427,861 ticks Buffer.BlockCopy: 421,929 ticks Array.Copy: 420,556 ticks Buffer.memcpyimpl: 421,541 ticks Buffer.BlockCopy: 436,430 ticks Array.Copy: 435,297 ticks Buffer.memcpyimpl: 432,505 ticks Buffer.BlockCopy: 441,493 ticks
现在再次更改顺序:
Buffer.memcpyimpl: 430,874 ticks Buffer.BlockCopy: 429,730 ticks Array.Copy: 432,746 ticks Buffer.memcpyimpl: 415,943 ticks Buffer.BlockCopy: 423,809 ticks Array.Copy: 428,703 ticks Buffer.memcpyimpl: 421,270 ticks Buffer.BlockCopy: 428,262 ticks Array.Copy: 434,940 ticks Buffer.memcpyimpl: 423,506 ticks Buffer.BlockCopy: 427,220 ticks Array.Copy: 431,606 ticks Buffer.memcpyimpl: 422,900 ticks Buffer.BlockCopy: 439,280 ticks Array.Copy: 432,649 ticks
或者换句话说:他们很有竞争力;一般而言,memcpyimpl速度最快,但不必担心。
memcpyimpl