2 Answers

Contributor with 1806 experience points, received 8+ upvotes
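You can filter right away by changing the condition if ($perc != 100) to if ($perc > 20), so that only the similar posts you want to remove are kept. You can then skip storing the similarity percentage altogether, because you already have a list of post IDs to remove for each post.
So, once you have code like this:
if ($perc > 20) {
    $similarityPercentageArr[$currentPost['ID']][] = $comparePost['ID'];
}
you can remove all the unwanted posts like this: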
$postsToRemove = [];
$postsToKeep = [];
foreach ($similarityPercentageArr as $postId => $similarPostIds) {
    // this post has already appeared as similar somewhere, so its similar posts have already been added
    if (in_array($postId, $postsToRemove)) {
        continue;
    }
    $postsToKeep[] = $postId;
    $postsToRemove = array_merge($postsToRemove, $similarPostIds);
}
Now you have the IDs of the original posts in $postsToKeep, and the IDs of the posts similar to them in $postsToRemove.
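To see how the loop above behaves, here is a minimal, self-contained sketch run on a made-up similarity map. The function name `splitPosts` and the sample IDs are assumptions for illustration only and are not part of the original code.

```php
<?php
// Hypothetical helper: splits a similarity map into IDs to keep and IDs to remove.
function splitPosts(array $similarityPercentageArr): array
{
    $postsToRemove = [];
    $postsToKeep = [];
    foreach ($similarityPercentageArr as $postId => $similarPostIds) {
        // Already flagged as similar to an earlier post, so skip it.
        if (in_array($postId, $postsToRemove)) {
            continue;
        }
        $postsToKeep[] = $postId;
        $postsToRemove = array_merge($postsToRemove, $similarPostIds);
    }
    // Drop repeated IDs and reindex the removal list.
    return [$postsToKeep, array_values(array_unique($postsToRemove))];
}

// Made-up map: posts 1, 2 and 4 resemble each other, as do posts 3 and 5.
$map = [
    1 => [2, 4],
    2 => [1, 4],
    3 => [5],
    4 => [1, 2],
    5 => [3],
];

[$postsToKeep, $postsToRemove] = splitPosts($map);
print_r($postsToKeep);   // contains IDs 1 and 3
print_r($postsToRemove); // contains IDs 2, 4 and 5
```

Note that a post with no entry in the map ends up in neither list, so $postsToRemove is the list to act on.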
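I would also optimize the code slightly so that similar_text is not called at all when you know you are comparing a post with itself. So instead of if (!is_null($comparePost['ID'])) you would have if (!is_null($comparePost['ID']) && $comparePost['ID'] !== $currentPost['ID']).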
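The self-comparison check could be sketched as follows. The function name `buildSimilarityMap`, the `$threshold` parameter, and the sample posts are assumptions made for illustration; each post is assumed to be an array with `ID` and `post_content` keys.

```php
<?php
// Hypothetical helper: maps each post ID to the IDs of posts above the threshold.
function buildSimilarityMap(array $posts, float $threshold = 20.0): array
{
    $similarityPercentageArr = [];
    foreach ($posts as $currentPost) {
        foreach ($posts as $comparePost) {
            // Skip self-comparison so similar_text() is never called for it.
            if ($comparePost['ID'] === $currentPost['ID']) {
                continue;
            }
            similar_text(
                strip_tags($currentPost['post_content']),
                strip_tags($comparePost['post_content']),
                $perc
            );
            if ($perc > $threshold) {
                $similarityPercentageArr[$currentPost['ID']][] = $comparePost['ID'];
            }
        }
    }
    return $similarityPercentageArr;
}

// Made-up posts: 1 and 2 are identical once tags are stripped,
// and 3 shares no characters with either of them.
$posts = [
    ['ID' => 1, 'post_content' => '<p>Lorem ipsum dolor sit</p>'],
    ['ID' => 2, 'post_content' => 'Lorem ipsum dolor sit'],
    ['ID' => 3, 'post_content' => '0123456789'],
];

$map = buildSimilarityMap($posts);
print_r($map); // post 1 maps to [2], post 2 maps to [1], post 3 has no entry
```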

Contributor with 1817 experience points, received 14+ upvotes
similar_text — Calculate the similarity between two strings
levenshtein — Calculate Levenshtein distance between two strings
soundex — Calculate the soundex key of a string
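As a quick illustration of how these three functions differ, here is a small sketch. The sample strings are arbitrary and chosen only for this example.

```php
<?php
// similar_text() returns the number of matching characters and can
// report a percentage through its optional third argument.
$matching = similar_text('World', 'Word', $percent);
echo "similar_text: $matching matching characters, " . round($percent, 2) . "%\n";

// levenshtein() returns the minimum number of single-character edits
// (insert, replace, delete) needed to turn one string into the other.
$distance = levenshtein('kitten', 'sitting');
echo "levenshtein: $distance\n";

// soundex() returns a four-character phonetic key, so words that sound
// alike in English tend to share the same key.
$keyA = soundex('Robert');
$keyB = soundex('Rupert');
echo "soundex: $keyA / $keyB\n";
```

For comparing whole posts, similar_text with its percentage is the most direct fit; the other two are better suited to short strings and single words.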
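As for your question, after reading it, the title does not seem to match what you are actually asking.
Wouldn't adding one more condition be enough?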
<?php
$posts = [
    'post_count' => 4,
    'posts' => [
        [
            'ID' => 1,
            'post_content' => "Wrong do point avoid by fruit learn or in death. So passage however besides invited comfort elderly be me. Walls began of child civil am heard hoped my. Satisfied pretended mr on do determine by.",
        ],
        [
            'ID' => 2,
            'post_content' => "Lorem ipsum dolor sit"
        ],
        [
            'ID' => 3,
            'post_content' => "Months on ye at by esteem desire warmth former. Sure that that way gave any fond now. His boy middleton sir nor engrossed affection excellent."
        ],
        [
            'ID' => 4,
            'post_content' => "Lorem ipsum dolor sit"
        ],
    ]
];
print_r($posts);

function getNonSimilarTexts($posts)
{
    $similarityPercentageArr = array();
    // post_count matches the number of posts, so use < to stay within bounds
    for ($i = 0; $i < $posts['post_count']; $i++) {
        // $posts->the_post();
        $currentPost = $posts['posts'][$i];
        if (!is_null($currentPost['ID'])) {
            for ($y = 0; $y < $posts['post_count']; $y++) {
                $comparePost = $posts['posts'][$y];
                if (!is_null($comparePost['ID'])) {
                    similar_text(strip_tags($currentPost['post_content']), strip_tags($comparePost['post_content']), $perc);
                    // skip 100% matches (self-comparisons, but also exact duplicates)
                    // and keep only pairs that are more than 20% similar
                    if ($perc != 100 && $perc > 20) {
                        array_push($similarityPercentageArr, [$currentPost['ID'], $comparePost['ID'], $perc]);
                    }
                }
            }
        }
    }
    return $similarityPercentageArr;
}

$p = getNonSimilarTexts($posts);
print_r($p);
Output:
Array
(
    [0] => Array
        (
            [0] => 1
            [1] => 3
            [2] => 23.145400593472
        )

)
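The output lists the pair (1, 3) but not (3, 1). One possible explanation is that similar_text can return different results when its two arguments are swapped, so one direction may pass the 20% threshold while the other does not. A minimal sketch, using arbitrary sample strings that are not taken from the posts above:

```php
<?php
// Arbitrary sample strings, chosen only to show that argument order matters.
$a = 'PHP IS GREAT';
$b = 'WITH MYSQL';

$simAB = similar_text($a, $b, $percAB);
$simBA = similar_text($b, $a, $percBA);

echo "a vs b: $simAB matching characters, " . round($percAB, 2) . "%\n";
echo "b vs a: $simBA matching characters, " . round($percBA, 2) . "%\n";
```

If a symmetric score is needed, one option is to compare in both directions and use the larger of the two percentages.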