    // let's say there is a list of 1000+ URLs
    string[] urls = { "http://google.com", "http://yahoo.com", ... };

    // now let's send HTTP requests to each of these URLs in parallel
    urls.AsParallel().ForAll(async (url) => {
        var client = new HttpClient();
        var html = await client.GetStringAsync(url);
    });
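Here is the problem: this starts 1000+ web requests simultaneously. Is there an easy way to limit the number of concurrent async HTTP requests, so that no more than 20 web pages are being downloaded at any given time? How can this be done in the most efficient way?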
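You can definitely do this with the latest version of async for .NET, using .NET 4.5 Beta. The previous post from 'usr' points to a good article written by Stephen Toub, but the less widely announced news is that the async semaphore actually made it into the Beta release of .NET 4.5.
If you look at our beloved SemaphoreSlim class (which you should be using, since it performs better than the original Semaphore), it now has the WaitAsync(...) series of overloads, with all of the expected arguments: timeout intervals, cancellation tokens, all of your usual scheduling friends :)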
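To make the timeout and cancellation overloads mentioned above concrete, here is a minimal sketch of `WaitAsync(TimeSpan, CancellationToken)`. The class name `WaitAsyncDemo`, the helper `TryRunAsync`, and the delays are hypothetical and chosen only for illustration; only `SemaphoreSlim`, `WaitAsync`, and `Release` come from the framework.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WaitAsyncDemo
{
    // Hypothetical helper: tries to enter the semaphore, giving up after `timeout`
    // or when `token` is cancelled. Returns true if the work ran, false if no
    // slot could be acquired within the timeout.
    public static async Task<bool> TryRunAsync(
        SemaphoreSlim gate, Func<Task> work, TimeSpan timeout, CancellationToken token)
    {
        // WaitAsync(TimeSpan, CancellationToken) yields false on timeout and
        // throws OperationCanceledException if the token is cancelled first.
        if (!await gate.WaitAsync(timeout, token))
        {
            return false;
        }

        try
        {
            await work();
            return true;
        }
        finally
        {
            // Always give the slot back, even if the work throws.
            gate.Release();
        }
    }

    public static async Task Main()
    {
        var gate = new SemaphoreSlim(initialCount: 1);
        using (var cts = new CancellationTokenSource())
        {
            // The first caller takes the only slot and holds it for about 500 ms.
            Task<bool> first = TryRunAsync(
                gate, () => Task.Delay(500), TimeSpan.FromSeconds(5), cts.Token);

            // The second caller is only willing to wait 50 ms, so it is expected to time out.
            Task<bool> second = TryRunAsync(
                gate, () => Task.Delay(10), TimeSpan.FromMilliseconds(50), cts.Token);

            Console.WriteLine("first ran: " + await first);
            Console.WriteLine("second ran: " + await second);
        }
    }
}
```

Whether a timed-out request should be skipped, retried, or reported is a design choice this sketch leaves open.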
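Stephen has also written a more recent blog post about the new .NET 4.5 goodies that came out with the Beta; see "What's New for Parallelism in .NET 4.5 Beta".
Lastly, here is some sample code showing how to use SemaphoreSlim for async method throttling: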
    public async Task MyOuterMethod()
    {
        // let's say there is a list of 1000+ URLs
        string[] urls = { "http://google.com", "http://yahoo.com", ... };

        // now let's send HTTP requests to each of these URLs in parallel
        var allTasks = new List<Task>();
        var throttler = new SemaphoreSlim(initialCount: 20);
        foreach (var url in urls)
        {
            // do an async wait until we can schedule again
            await throttler.WaitAsync();

            // using Task.Run(...) to run the lambda in its own parallel
            // flow on the threadpool
            allTasks.Add(
                Task.Run(async () =>
                {
                    try
                    {
                        var client = new HttpClient();
                        var html = await client.GetStringAsync(url);
                    }
                    finally
                    {
                        // free the slot whether the request succeeded or threw
                        throttler.Release();
                    }
                }));
        }

        // won't get here until all urls have been put into tasks
        await Task.WhenAll(allTasks);

        // won't get here until all tasks have completed in some way
        // (either success or exception)
    }
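Last, but probably worth a mention, is a solution that uses TPL-based scheduling. You can create delegate-bound tasks on the TPL that have not yet been started, and let a custom task scheduler limit the concurrency. In fact, MSDN has a sample for it.
See also TaskScheduler.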
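The scheduler-based idea can be sketched roughly as follows. Note that this is not the MSDN sample: instead of a hand-written scheduler it swaps in the built-in `ConcurrentExclusiveSchedulerPair` (also part of .NET 4.5) as the concurrency-limiting scheduler. The class name `SchedulerThrottleDemo`, the method `RunAndMeasurePeak`, and the `Thread.Sleep` stand-in for real work are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SchedulerThrottleDemo
{
    // Hypothetical helper: runs `itemCount` synchronous work items on a scheduler
    // capped at `maxConcurrency` and returns the highest number of items that
    // were observed running at the same time.
    public static int RunAndMeasurePeak(int itemCount, int maxConcurrency)
    {
        // The ConcurrentScheduler half of the pair runs at most
        // `maxConcurrency` tasks at once on top of the default thread pool.
        var pair = new ConcurrentExclusiveSchedulerPair(TaskScheduler.Default, maxConcurrency);
        TaskScheduler limited = pair.ConcurrentScheduler;

        var sync = new object();
        int running = 0;
        int peak = 0;
        var tasks = new List<Task>();

        for (int i = 0; i < itemCount; i++)
        {
            // A delegate-bound task that is created but not yet started.
            var task = new Task(() =>
            {
                lock (sync)
                {
                    running++;
                    if (running > peak) peak = running;
                }

                Thread.Sleep(20); // stand-in for blocking work

                lock (sync)
                {
                    running--;
                }
            });

            // Starting the task on the limited scheduler is what applies the cap.
            task.Start(limited);
            tasks.Add(task);
        }

        Task.WaitAll(tasks.ToArray());
        return peak;
    }

    public static void Main()
    {
        int peak = RunAndMeasurePeak(itemCount: 20, maxConcurrency: 3);
        Console.WriteLine("peak concurrency: " + peak);
    }
}
```

One caveat: a scheduler only limits the synchronous portions of a delegate. An `async` lambda gives its slot back at the first `await`, so for throttling in-flight HTTP requests the SemaphoreSlim approach is likely the better fit, while a limited scheduler suits blocking or CPU-bound work.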