How to wait for all goroutines to finish without using time.Sleep?

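This code selects all XML files in the same folder as the invoked executable and asynchronously applies processing to each result in the callback method (in the example below, only the name of the file is printed).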

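How do I avoid using the sleep method to keep the main method from exiting? I have trouble wrapping my head around channels (I assume that is what is needed to synchronize the results), so any help is appreciated!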

package main

import (
    "fmt"
    "io/ioutil"
    "os"
    "path"
    "path/filepath"
    "runtime"
    "time"
)

func eachFile(extension string, callback func(file string)) {
    exeDir := filepath.Dir(os.Args[0])
    files, _ := ioutil.ReadDir(exeDir)
    for _, f := range files {
        fileName := f.Name()
        if extension == path.Ext(fileName) {
            go callback(fileName)
        }
    }
}

func main() {
    maxProcs := runtime.NumCPU()
    runtime.GOMAXPROCS(maxProcs)

    eachFile(".xml", func(fileName string) {
        // Custom logic goes in here
        fmt.Println(fileName)
    })

    // This is what i want to get rid of
    time.Sleep(100 * time.Millisecond)
}

You can use sync.WaitGroup. Quoting the linked example:

package main

import (
    "net/http"
    "sync"
)

func main() {
    var wg sync.WaitGroup
    var urls = []string{
        "http://www.golang.org/",
        "http://www.google.com/",
        "http://www.somestupidname.com/",
    }
    for _, url := range urls {
        // Increment the WaitGroup counter.
        wg.Add(1)
        // Launch a goroutine to fetch the URL.
        go func(url string) {
            // Decrement the counter when the goroutine completes.
            defer wg.Done()
            // Fetch the URL.
            http.Get(url)
        }(url)
    }
    // Wait for all HTTP fetches to complete.
    wg.Wait()
}

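WaitGroups are definitely the canonical way to do this. For the sake of completeness, though, here is the solution that was commonly used before WaitGroups were introduced. The basic idea is to use a channel to say "I'm done" and have the main goroutine wait until each spawned goroutine has reported its completion.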

func main() {
    c := make(chan struct{}) // We don't need any data to be passed, so use an empty struct
    for i := 0; i < 100; i++ {
        go func() {
            doSomething()
            c <- struct{}{} // signal that the routine has completed
        }()
    }

    // Since we spawned 100 routines, receive 100 messages.
    for i := 0; i < 100; i++ {
        <-c
    }
}
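The same channel can also carry each goroutine's result back instead of an empty struct, so a single receive both collects a value and counts a completion. A minimal sketch, assuming a hypothetical `squareAll` helper that is not part of the question; the channel is buffered here so that senders never block, which is a choice made for this sketch rather than a requirement.

```go
package main

import "fmt"

// squareAll is a hypothetical helper: it squares each number in its own
// goroutine and sums the results received over a channel.
func squareAll(nums []int) int {
	// One buffer slot per goroutine, so no sender ever has to wait.
	results := make(chan int, len(nums))
	for _, n := range nums {
		go func(n int) {
			results <- n * n // the result doubles as the "I'm done" signal
		}(n)
	}

	// Receive exactly one value per goroutine that was started.
	sum := 0
	for range nums {
		sum += <-results
	}
	return sum
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3})) // prints 14
}
```

As in the version above, the receiver has to know how many goroutines were started; that count is what replaces the WaitGroup counter.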

WaitGroup can help you here.

package main

import (
    "fmt"
    "sync"
    "time"
)

func wait(seconds int, wg *sync.WaitGroup) {
    defer wg.Done()

    time.Sleep(time.Duration(seconds) * time.Second)
    fmt.Println("Slept ", seconds, " seconds ..")
}

func main() {
    var wg sync.WaitGroup

    for i := 0; i <= 5; i++ {
        wg.Add(1)
        go wait(i, &wg)
    }
    wg.Wait()
}

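Although sync.WaitGroup (wg) is the canonical way forward, it does require you to make at least some of your wg.Add calls before you call wg.Wait for everything to complete. This may not be feasible for simple things like a web crawler, where you do not know the number of recursive calls beforehand and it takes a while to retrieve the data that drives the wg.Add calls. After all, you need to load and parse the first page before you know the size of the first batch of child pages.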

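I wrote a solution using channels, avoiding WaitGroup, in my solution to the Tour of Go web crawler exercise. Each time one or more goroutines are started, you send their number to the children channel. Each time a goroutine is about to complete, you send a 1 to the done channel. When the sum of children equals the sum of done, we are done.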

My only remaining concern is the hard-coded size of the results channel, but that is a (current) Go limitation.


// recursionController is a data structure with three channels to control our Crawl recursion.
// Tried to use sync.waitGroup in a previous version, but I was unhappy with the mandatory sleep.
// The idea is to have three channels, counting the outstanding calls (children), completed calls
// (done) and results (results).  Once outstanding calls == completed calls we are done (if you are
// sufficiently careful to signal any new children before closing your current one, as you may be the last one).
//
type recursionController struct {
    results  chan string
    children chan int
    done     chan int
}

// instead of instantiating one instance, as we did above, use a more idiomatic Go solution
func NewRecursionController() recursionController {
    // we buffer results to 1000, so we cannot crawl more pages than that.
    return recursionController{make(chan string, 1000), make(chan int), make(chan int)}
}

// recursionController.Add: convenience function to add children to controller (similar to waitGroup)
func (rc recursionController) Add(children int) {
    rc.children <- children
}

// recursionController.Done: convenience function to remove a child from controller (similar to waitGroup)
func (rc recursionController) Done() {
    rc.done <- 1
}

// recursionController.Wait will wait until all children are done
func (rc recursionController) Wait() {
    fmt.Println("Controller waiting...")
    var children, done int
    for {
        select {
        case childrenDelta := <-rc.children:
            children += childrenDelta
            // fmt.Printf("children found %v total %v\n", childrenDelta, children)
        case <-rc.done:
            done += 1
            // fmt.Println("done found", done)
        default:
            if done > 0 && children == done {
                fmt.Printf("Controller exiting, done = %v, children =  %v\n", done, children)
                close(rc.results)
                return
            }
        }
    }
}

Full source code for the solution
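For comparison, the "signal new children before finishing yourself" rule from the comment above can also be followed with a sync.WaitGroup: if every goroutine calls Add for its children before its own Done runs, the counter cannot reach zero while work is still outstanding. The sketch below is an illustration under assumptions, not the crawler from this answer: the `links` map is a hypothetical stand-in for fetching and parsing pages, and `visitor` and `countReachable` are names invented here.

```go
package main

import (
	"fmt"
	"sync"
)

// links is a hypothetical stand-in for fetching a page and parsing its links.
var links = map[string][]string{
	"root": {"a", "b"},
	"a":    {"c"},
	"b":    {"c"},
	"c":    {"root"}, // a cycle, to show that visited pages are skipped
}

// visitor tracks outstanding goroutines and the pages already seen.
type visitor struct {
	wg   sync.WaitGroup
	mu   sync.Mutex
	seen map[string]bool
}

// visit marks node as seen and starts one goroutine per child.
func (v *visitor) visit(node string) {
	defer v.wg.Done() // runs only after all children below were added

	v.mu.Lock()
	if v.seen[node] {
		v.mu.Unlock()
		return
	}
	v.seen[node] = true
	v.mu.Unlock()

	for _, child := range links[node] {
		v.wg.Add(1) // counted while this goroutine still holds its own count
		go v.visit(child)
	}
}

// countReachable returns the number of distinct nodes reachable from root.
func countReachable(root string) int {
	v := &visitor{seen: make(map[string]bool)}
	v.wg.Add(1) // the only Add that happens before Wait
	go v.visit(root)
	v.wg.Wait()
	return len(v.seen)
}

func main() {
	fmt.Println(countReachable("root")) // prints 4
}
```

Only the first Add has to happen before Wait is called; the remaining ones are issued as the children are discovered, so the total number of calls does not need to be known up front.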

Here is a solution that employs WaitGroup.

First, define two utility methods:

package util

import (
    "sync"
)

var allNodesWaitGroup sync.WaitGroup

func GoNode(f func()) {
    allNodesWaitGroup.Add(1)
    go func() {
        defer allNodesWaitGroup.Done()
        f()
    }()
}

func WaitForAllNodes() {
    allNodesWaitGroup.Wait()
}

Then, replace the invocation of callback:

go callback(fileName)

with a call to your utility function:

util.GoNode(func() { callback(fileName) })

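Last step: add this line at the end of your main, in place of the sleep. This makes sure the main goroutine waits for all goroutines to finish before the program can stop.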

func main() {
    // ...
    util.WaitForAllNodes()
}