I put both versions into a benchmark. Performance is almost equal between the two: append is slower, but only negligibly so.
package main_test

import "testing"

func BenchmarkMerge1(b *testing.B) {
	for i := 0; i < b.N; i++ {
		num1 := []int{1, 2, 3}
		num2 := []int{4, 5, 6}
		merge1(num1, len(num1), num2, len(num2))
	}
}

// merge1 merges the sorted prefixes nums1[:m] and nums2[:n] back into nums1.
// Merged values are staged in a temporary slice of length m+n that is filled
// by index assignment.
func merge1(nums1 []int, m int, nums2 []int, n int) {
	tmpSlice := make([]int, m+n)
	tmpIndex := 0
	index1 := 0
	index2 := 0
	for index1 < m {
		value1 := nums1[index1]
		for index2 < n {
			value2 := nums2[index2]
			if value1 <= value2 {
				break
			} else {
				tmpSlice[tmpIndex] = value2 // <-- Assign
				index2++
				tmpIndex++
			}
		}
		tmpSlice[tmpIndex] = value1 // <-- Assign
		index1++
		tmpIndex++
	}
	copy(nums1, tmpSlice[:tmpIndex])
	copy(nums1[tmpIndex:], nums2[index2:])
}

func BenchmarkMerge2(b *testing.B) {
	for i := 0; i < b.N; i++ {
		num1 := []int{1, 2, 3}
		num2 := []int{4, 5, 6}
		merge2(num1, len(num1), num2, len(num2))
	}
}

// merge2 is identical to merge1, except that the temporary slice starts with
// length 0 and capacity m+n and is filled with append.
func merge2(nums1 []int, m int, nums2 []int, n int) {
	tmpSlice := make([]int, 0, m+n)
	tmpIndex := 0
	index1 := 0
	index2 := 0
	for index1 < m {
		value1 := nums1[index1]
		for index2 < n {
			value2 := nums2[index2]
			if value1 <= value2 {
				break
			} else {
				tmpSlice = append(tmpSlice, value2) // <-- Append
				index2++
				tmpIndex++
			}
		}
		tmpSlice = append(tmpSlice, value1) // <-- Append
		index1++
		tmpIndex++
	}
	copy(nums1, tmpSlice[:tmpIndex])
	copy(nums1[tmpIndex:], nums2[index2:])
}
Running tool: /usr/local/go/bin/go test -benchmem -run=^$ -bench ^(BenchmarkMerge1|BenchmarkMerge2)$ example.com/m
goos: linux
goarch: amd64
pkg: example.com/m
cpu: Intel(R) Core(TM) i7-10870H CPU @ 2.20GHz
BenchmarkMerge1-16    34586568    36.40 ns/op    48 B/op    1 allocs/op
BenchmarkMerge2-16    32561293    36.77 ns/op    48 B/op    1 allocs/op
PASS
ok example.com/m 2.533s
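This is expected: as long as the slice has spare capacity, append basically performs an assignment. append also increments the len field in the slice header (thanks to @rustyx for that hint), which explains the difference.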
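To make that concrete, here is a minimal sketch of what append does while capacity remains. The helper name appendInPlace is made up for illustration and is not part of the benchmark. It compares the address of the first backing-array element before and after the append to tell whether the same array was reused.

```go
package main

import "fmt"

// appendInPlace (hypothetical helper) appends v to s and reports whether
// the result still points at the same backing array as s.
func appendInPlace(s []int, v int) (out []int, reused bool) {
	out = append(s, v)
	// Reslicing to [:1] is valid whenever cap > 0, even if len is 0,
	// so this compares the first slot of each backing array.
	reused = cap(s) > 0 && &s[:1][0] == &out[:1][0]
	return out, reused
}

func main() {
	s := make([]int, 0, 2) // len 0, cap 2
	for _, v := range []int{7, 8, 9} {
		var reused bool
		s, reused = appendInPlace(s, v)
		fmt.Printf("len=%d reused=%t\n", len(s), reused)
	}
}
```

The first two appends fit into the existing capacity, so they should report reused=true with only len changing. The third exceeds the capacity and has to move to a new array.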
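You will see a bigger difference when you use append without setting an initial capacity on the slice, because append then has to "grow" the underlying array, which takes time.
If we change tmpSlice := make([]int, 0, m+n) to tmpSlice := make([]int, 0) in merge2, we get the following results: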
Running tool: /usr/local/go/bin/go test -benchmem -run=^$ -bench ^(BenchmarkMerge1|BenchmarkMerge2)$ example.com/m
goos: linux
goarch: amd64
pkg: example.com/m
cpu: Intel(R) Core(TM) i7-10870H CPU @ 2.20GHz
BenchmarkMerge1-16    37319397    32.34 ns/op    48 B/op    1 allocs/op
BenchmarkMerge2-16    14543529    87.75 ns/op    56 B/op    3 allocs/op
PASS
ok example.com/m 2.604s
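The extra allocations come from append repeatedly outgrowing the backing array. As a rough sketch (countReallocs is a hypothetical helper, not part of the benchmark), you can count how often the capacity changes while appending. The exact growth steps are an implementation detail of the Go runtime, so treat the printed numbers as indicative only.

```go
package main

import "fmt"

// countReallocs (hypothetical helper) appends n values to s and counts how
// often the capacity changes, i.e. how often append had to move the data
// to a new, larger backing array.
func countReallocs(s []int, n int) int {
	reallocs := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			reallocs++
		}
	}
	return reallocs
}

func main() {
	// Three appends, matching the three values merge2 appends in the benchmark.
	fmt.Println("preallocated:", countReallocs(make([]int, 0, 3), 3))
	fmt.Println("no capacity: ", countReallocs(make([]int, 0), 3))
}
```

With enough capacity up front, the count stays at zero. Without it, each growth step is a fresh allocation plus a copy of the existing elements, which is where the extra time in the second benchmark run goes.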
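TL;DR: as long as the slice has capacity, append is slower than index assignment (because it also increments the len field of the slice), but the difference is almost negligible.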