
Loading a map with and without goroutines

梵蒂岡之花 2023-08-14 16:41:35

Here is an interesting situation I ran into. After doing some data manipulation with goroutines, I need to read data from a file and populate a map based on what we find. Below is a simplified problem statement and example.

Generate the data needed for the run with gen_data.sh:

    #!/bin/bash
    rm some.dat || :
    for i in `seq 1 10000`; do
        echo "$i `date` tx: $RANDOM rx:$RANDOM" >> some.dat
    done

If I read the lines of some.dat into a map[int]string without goroutines, using loadtoDict.go, everything stays aligned (the first and second words are the same; see the output below).

In real life I do need to process the lines (which is expensive) before loading them into the map. Using goroutines speeds up my dictionary creation, and that is an important requirement for the real problem.

loadtoDict.go

    package main

    import (
        "bufio"
        "fmt"
        "log"
        "os"
    )

    var (
        fileName = "some.dat"
    )

    func checkerr(err error) {
        if err != nil {
            fmt.Println(err)
            log.Fatal(err)
        }
    }

    func main() {
        ourDict := make(map[int]string)
        f, err := os.Open(fileName)
        checkerr(err)
        defer f.Close()
        fscanner := bufio.NewScanner(f)
        indexPos := 1
        for fscanner.Scan() {
            text := fscanner.Text()
            //fmt.Println("text", text)
            ourDict[indexPos] = text
            indexPos++
        }
        for i, v := range ourDict {
            fmt.Printf("%d: %s\n", i, v)
        }
    }

Run:

    $ ./loadtoDict
    ...
    8676: 8676 Mon Dec 23 15:52:24 PST 2019 tx: 17718 rx:1133
    2234: 2234 Mon Dec 23 15:52:20 PST 2019 tx: 13170 rx:15962
    3436: 3436 Mon Dec 23 15:52:21 PST 2019 tx: 17519 rx:5419
    6177: 6177 Mon Dec 23 15:52:23 PST 2019 tx: 5731 rx:5449

Note how the first and second words are "aligned". However, if I load the map using goroutines, this goes wrong: async_loadtoDict.go


3 Answers

偶然的你

Contributed 1841 experience points, received more than 3 likes

Your semaphore sem has no effect because you made it deeply buffered.
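To make the point about buffer depth concrete: a buffered channel used as a semaphore admits as many goroutines at once as its capacity. The following is a minimal sketch, not code from the question; peakConcurrency and its parameters are hypothetical names, and the one-millisecond sleep merely stands in for real work. It records the highest number of goroutines that were inside the guarded section at the same time.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// peakConcurrency (hypothetical helper) launches `tasks` goroutines guarded by
// a channel semaphore of the given capacity and returns the largest number of
// goroutines observed inside the guarded section at the same time.
func peakConcurrency(capacity, tasks int) int64 {
	sem := make(chan struct{}, capacity)
	var wg sync.WaitGroup
	var running, peak int64

	for i := 0; i < tasks; i++ {
		wg.Add(1)
		sem <- struct{}{} // blocks once `capacity` goroutines are in flight
		go func() {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when finished

			n := atomic.AddInt64(&running, 1)
			// Record a new peak if this goroutine pushed the count higher.
			for {
				p := atomic.LoadInt64(&peak)
				if n <= p || atomic.CompareAndSwapInt64(&peak, p, n) {
					break
				}
			}
			time.Sleep(time.Millisecond) // stand-in for real work
			atomic.AddInt64(&running, -1)
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println("capacity 1, peak:", peakConcurrency(1, 20))
	fmt.Println("capacity 8, peak:", peakConcurrency(8, 20))
}
```

With a capacity of 1 the goroutines run strictly one at a time; with a larger capacity up to that many overlap, so a very deep buffer places almost no limit on concurrency.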

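In general, this is the wrong way to set up a map for this kind of task, because reading the file will be the slow part. If you had a more complex task (for example: read a line, think hard, set something), you would want this as your pseudo-code structure: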

    type workType struct {
        index int
        line  string
    }

    var wg sync.WaitGroup
    wg.Add(nWorkers)
    // I made this buffered originally but there's no real point, so
    // fixing that in an edit
    work := make(chan workType)
    for i := 0; i < nWorkers; i++ {
        go readAndDoWork(work, &wg)
    }
    for i := 1; fscanner.Scan(); i++ {
        work <- workType{index: i, line: fscanner.Text()}
    }
    close(work)
    wg.Wait()
    // ... now your dictionary is ready ...

The workers do this:

    func readAndDoWork(ch chan workType, wg *sync.WaitGroup) {
        for item := range ch {
            // ... do computation ...
            insertIntoDict(item.index, result)
        }
        wg.Done()
    }

Have insertIntoDict grab the mutex (to protect the map from index to result) and write into the dictionary. (You can inline it if you prefer.)

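The idea here is to set up some number of workers (probably based on the number of available CPUs), each of which grabs the next work item and processes it. The main goroutine just parcels out the work, then closes the work channel, which causes all the workers to see end of input, and then waits for them to signal that they are done computing.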
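The pseudo-code above could be filled in roughly as follows. This is only a sketch under stated assumptions: the dict type, its insert method, process, and loadDict are hypothetical names introduced for illustration, strings.ToUpper stands in for the expensive per-line computation, and the input comes from an in-memory string rather than some.dat so that the example is self-contained.

```go
package main

import (
	"bufio"
	"fmt"
	"runtime"
	"strings"
	"sync"
)

type workType struct {
	index int
	line  string
}

// dict (hypothetical) pairs the map with the mutex that guards it.
type dict struct {
	mu sync.Mutex
	m  map[int]string
}

// insert plays the role of insertIntoDict: lock, write, unlock.
func (d *dict) insert(index int, result string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.m[index] = result
}

// process is a placeholder for the expensive per-line computation.
func process(line string) string {
	return strings.ToUpper(line)
}

func readAndDoWork(ch <-chan workType, wg *sync.WaitGroup, d *dict) {
	defer wg.Done()
	for item := range ch { // ends when the channel is closed and drained
		d.insert(item.index, process(item.line))
	}
}

// loadDict scans input line by line and fills the map using nWorkers workers.
func loadDict(input string, nWorkers int) map[int]string {
	d := &dict{m: make(map[int]string)}
	work := make(chan workType)
	var wg sync.WaitGroup
	wg.Add(nWorkers)
	for i := 0; i < nWorkers; i++ {
		go readAndDoWork(work, &wg, d)
	}
	scanner := bufio.NewScanner(strings.NewReader(input))
	for i := 1; scanner.Scan(); i++ {
		work <- workType{index: i, line: scanner.Text()}
	}
	close(work) // workers see end of input
	wg.Wait()   // all results are in the map
	return d.m
}

func main() {
	d := loadDict("1 alpha\n2 beta\n3 gamma\n", runtime.NumCPU())
	for i := 1; i <= len(d); i++ {
		fmt.Printf("%d: %s\n", i, d[i])
	}
}
```

Because each work item carries its own index, the order in which workers finish does not affect which key a line ends up under.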

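(If you like, you can create one more goroutine that reads the results computed by the workers and puts them into the map. That way you do not need a mutex on the map itself.)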
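A minimal sketch of that mutex-free variant, assuming a fixed pool of workers plus one collector goroutine that owns the map. The names item, collect, and the results and done channels are hypothetical, and strings.TrimSpace stands in for the real computation.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

type item struct {
	index int
	text  string
}

// collect fans lines out to nWorkers goroutines and funnels their results
// through one channel to a single goroutine that owns the map.
func collect(lines []string, nWorkers int) map[int]string {
	work := make(chan item)
	results := make(chan item)
	dict := make(map[int]string)

	// Collector: the only goroutine that touches dict, so no mutex is needed.
	done := make(chan struct{})
	go func() {
		defer close(done)
		for r := range results {
			dict[r.index] = r.text
		}
	}()

	var wg sync.WaitGroup
	wg.Add(nWorkers)
	for i := 0; i < nWorkers; i++ {
		go func() {
			defer wg.Done()
			for it := range work {
				// strings.TrimSpace is a placeholder for the expensive step.
				results <- item{it.index, strings.TrimSpace(it.text)}
			}
		}()
	}

	for i, line := range lines {
		work <- item{i + 1, line}
	}
	close(work)
	wg.Wait()      // every worker has sent its last result
	close(results) // lets the collector's range loop finish
	<-done         // dict is complete and safe to read
	return dict
}

func main() {
	d := collect([]string{" one ", " two ", " three "}, 2)
	fmt.Println(len(d), d[1], d[2], d[3]) // prints "3 one two three"
}
```

Waiting on done before reading the map matters: wg.Wait only guarantees that the workers have handed their results over, not that the collector has finished storing the last one.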

2023-08-14

弒天下

Contributed 1818 experience points, received more than 8 likes

OK, I have figured it out. Giving each goroutine its own value to hold, by passing it a copy, seems to work.

Change:

    for fscanner.Scan() {
        text := fscanner.Text()
        wg.Add(1)
        sem <- 1
        go func() {
            mu.Lock()
            defer mu.Unlock()
            ourDict[indexPos] = text
            indexPos++
            <-sem
            wg.Done()
        }()
    }

to:

    for fscanner.Scan() {
        text := fscanner.Text()
        wg.Add(1)
        sem <- 1
        go func(mypos int) {
            mu.Lock()
            defer mu.Unlock()
            ourDict[mypos] = text
            <-sem
            wg.Done()
        }(indexPos)
        indexPos++
    }

Full code: https://play.golang.org/p/dkHaisPHyHz
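The reason the change helps: arguments of a go statement are evaluated in the launching goroutine at the moment the goroutine is started, so each goroutine receives the position that belongs to its own line. In the first version the goroutines read and incremented the shared indexPos whenever they happened to be scheduled, so the index could drift away from the line. A self-contained sketch of the fixed pattern follows; fill is a hypothetical name, and an in-memory slice replaces the file.

```go
package main

import (
	"fmt"
	"sync"
)

// fill stores each line under the position it had when its goroutine was
// launched. The position is passed as an argument, so later increments of
// pos in the loop cannot change what the goroutine sees.
func fill(lines []string) map[int]string {
	dict := make(map[int]string)
	var mu sync.Mutex
	var wg sync.WaitGroup

	pos := 1
	for _, line := range lines {
		wg.Add(1)
		go func(mypos int, text string) {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			dict[mypos] = text // mypos is this goroutine's private copy
		}(pos, line)
		pos++ // only the launching goroutine ever modifies pos
	}
	wg.Wait()
	return dict
}

func main() {
	lines := []string{"1 a", "2 b", "3 c", "4 d"}
	d := fill(lines)
	for i := 1; i <= len(lines); i++ {
		fmt.Printf("%d: %s\n", i, d[i])
	}
}
```

The goroutines may still finish in any order, but that no longer matters because each one writes to a key that was fixed before it started.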

Using a worker pool:

    package main

    import (
        "bufio"
        "fmt"
        "log"
        "os"
        "sync"
    )

    const (
        MAX      = 10
        fileName = "some.dat"
    )

    type gunk struct {
        line string
        id   int
    }

    func main() {
        ourDict := make(map[int]string)
        wg := sync.WaitGroup{}
        mu := sync.RWMutex{}
        cha := make(chan gunk)

        // Start MAX workers; each one drains the channel until it is closed.
        for i := 0; i < MAX; i++ {
            wg.Add(1)
            go func(id int) {
                defer wg.Done()
                for {
                    textin, ok := <-cha
                    if !ok {
                        return
                    }
                    mu.Lock()
                    ourDict[textin.id] = textin.line
                    mu.Unlock()
                }
            }(i)
        }

        f, err := os.Open(fileName)
        checkerr(err)
        defer f.Close()
        fscanner := bufio.NewScanner(f)
        indexPos := 1
        for fscanner.Scan() {
            text := fscanner.Text()
            thisgunk := gunk{line: text, id: indexPos}
            cha <- thisgunk
            indexPos++
        }
        close(cha)
        wg.Wait()
        for i, v := range ourDict {
            fmt.Printf("%d: %s\n", i, v)
        }
    }

    func checkerr(err error) {
        if err != nil {
            fmt.Println(err)
            log.Fatal(err)
        }
    }

2023-08-14

躍然一笑

Contributed 1826 experience points, received more than 6 likes

As I mentioned in the comments, you cannot control the order in which goroutines execute, so you should not change the index from inside them.

Here is an example where the interaction with the map happens in a single goroutine, while your processing happens in other goroutines:

    package main

    import (
        "bufio"
        "fmt"
        "log"
        "os"
        "sync"
    )

    var (
        fileName = "some.dat"
        MAX      = 9000
    )

    func checkerr(err error) {
        if err != nil {
            fmt.Println(err)
            log.Fatal(err)
        }
    }

    type result struct {
        index int
        data  string
    }

    func main() {
        ourDict := make(map[int]string)
        f, err := os.Open(fileName)
        checkerr(err)
        defer f.Close()

        fscanner := bufio.NewScanner(f)

        var wg sync.WaitGroup
        sem := make(chan struct{}, MAX) // Use empty structs for semaphores as they have no allocation
        out := make(chan result)
        done := make(chan struct{})

        // This goroutine is the only one to write to the dict, so no race condition.
        // It has to be running before the workers start sending: out is unbuffered,
        // so otherwise every worker blocks on its send and never releases the semaphore.
        go func() {
            defer close(done)
            for entry := range out { // Loop ends when the channel is closed
                ourDict[entry.index] = entry.data
            }
        }()

        indexPos := 1
        for fscanner.Scan() {
            text := fscanner.Text()
            wg.Add(1)
            sem <- struct{}{}
            go func(index int, data string) {
                // Defer the release of your resources, otherwise if any error occurs
                // in your goroutine you'll have a deadlock
                defer func() {
                    wg.Done()
                    <-sem
                }()
                // Process your data
                out <- result{index, data}
            }(indexPos, text) // Pass in the values that change on each iteration so every goroutine gets its own copy
            indexPos++
        }

        wg.Wait()  // Every worker has handed over its result
        close(out) // Ends the collector's range loop
        <-done     // The dict is now complete and safe to read

        for i, v := range ourDict {
            fmt.Printf("%d: %s\n", i, v)
        }
    }

2023-08-14
