
Is there a way to make content.replace split them into more words than these?

JavaScript

一只斗牛犬 2023-12-14 16:43:54


1 Answer

海綿寶寶撒


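Well, I'm not sure what you're trying to do with .match(/\w+|\s+|[^\s\w]+/g). That is some unnecessary regex just to get an array of words and spaces. And it won't work at all if someone splits their bad word into something like "t h i s".

If you want the filter to be case-insensitive and to account for spaces and special characters, a better solution will probably need multiple regexes, with one check for split-up letters and a separate check for the normal bad word. You also need to make sure the split-letter check is accurate, otherwise something like "wash it" could be flagged as a bad word even though there is a space between the two words.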
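To make the first point concrete, here is a small sketch of what that tokenizing regex produces. The tokenize helper name is hypothetical and only used for illustration; it is not from the question.

```javascript
// Hypothetical helper: splits text into runs of word characters,
// runs of whitespace, and runs of any other symbols.
function tokenize(text) {
  return text.match(/\w+|\s+|[^\s\w]+/g) || [];
}

// A normal sentence tokenizes into whole words, so a word-by-word lookup works.
console.log(tokenize("hello, world"));

// A spaced-out word becomes single letters separated by spaces,
// so no token ever equals the full word.
const spacedTokens = tokenize("t h i s");
console.log(spacedTokens);
console.log(spacedTokens.includes("this")); // false
```

Because every letter ends up as its own token, a per-token comparison against the word list can never match the spaced-out form.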

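A solution

So here is one possible solution. Note that it is just one solution and far from the only one. I'll use hard-coded string examples instead of message.content so that it can run as a working snippet: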

//Our array of bad words
var badWords = [
    'bannedWord1',
    'bannedWord2',
    'bannedWord3',
    'bannedWord4'
];

//A function that tests if a given string contains a bad word
function testProfanity(string) {
    //Removes all non-letter, non-digit, and non-space chars
    var normalString = string.replace(/[^a-zA-Z0-9 ]/g, "");

    //Replaces all non-letter, non-digit chars with spaces
    var spacerString = string.replace(/[^a-zA-Z0-9]/g, " ");

    //Checks if a condition is true for at least one element in badWords
    return badWords.some(swear => {
        //Removes any non-letter, non-digit chars from the bad word (for normal)
        var filtered = swear.replace(/\W/g, "");

        //Splits the bad word into a 's p a c e d' word (for spaced)
        var spaced = filtered.split("").join(" ");

        //Two different regexes for normal and spaced bad word checks
        var checks = {
            spaced: new RegExp(`\\b${spaced}\\b`, "gi"),
            normal: new RegExp(`\\b${filtered}\\b`, "gi")
        };

        //If the normal or spaced checks are true in the string, return true
        //so that '.some()' will return true for satisfying the condition
        return spacerString.match(checks.spaced) || normalString.match(checks.normal);
    });
}

var result;

//Includes one banned word; expected result: true
var test1 = "I am a bannedWord1";
result = testProfanity(test1);
console.log(result);

//Includes one banned word; expected result: true
var test2 = "I am a b a N_N e d w o r d 2";
result = testProfanity(test2);
console.log(result);

//Includes one banned word; expected result: true
var test3 = "A bann_eD%word4, I am";
result = testProfanity(test3);
console.log(result);

//Includes no banned words; expected result: false
var test4 = "No banned words here";
result = testProfanity(test4);
console.log(result);

//This is a tricky one. 'bannedWord2' is technically present in this string,
//but is 'bannedWord22' really the same? This prevents something like
//"wash it" from being labeled a bad word; expected result: false
var test5 = "Banned word 22 isn't technically on the list of bad words...";
result = testProfanity(test5);
console.log(result);

I've commented every line thoroughly so you can see what I'm doing at each step. Here it is again, without the comments or the testing parts:

var badWords = [
    'bannedWord1',
    'bannedWord2',
    'bannedWord3',
    'bannedWord4'
];

function testProfanity(string) {
    var normalString = string.replace(/[^a-zA-Z0-9 ]/g, "");
    var spacerString = string.replace(/[^a-zA-Z0-9]/g, " ");

    return badWords.some(swear => {
        var filtered = swear.replace(/\W/g, "");
        var spaced = filtered.split("").join(" ");

        var checks = {
            spaced: new RegExp(`\\b${spaced}\\b`, "gi"),
            normal: new RegExp(`\\b${filtered}\\b`, "gi")
        };

        return spacerString.match(checks.spaced) || normalString.match(checks.normal);
    });
}
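The function above rebuilds both regexes for every word on every call. If that ever becomes a concern on a busy server, one possible variation is to build them once up front. This is only a sketch under assumptions: the names compiledChecks and testProfanityCompiled are hypothetical, the word list is a placeholder, and the "g" flag is dropped because .test() on a global regex keeps state in lastIndex between calls.

```javascript
// Placeholder list; replace with your own words.
const badWords = ["bannedWord1", "bannedWord2"];

// Build both patterns once per word instead of on every message.
// No "g" flag: test() on a global regex remembers lastIndex between calls.
const compiledChecks = badWords.map((swear) => {
  const filtered = swear.replace(/\W/g, "");
  const spaced = filtered.split("").join(" ");
  return {
    normal: new RegExp(`\\b${filtered}\\b`, "i"),
    spaced: new RegExp(`\\b${spaced}\\b`, "i"),
  };
});

// Same normalization as the original function, reusing the prebuilt regexes.
function testProfanityCompiled(string) {
  const normalString = string.replace(/[^a-zA-Z0-9 ]/g, "");
  const spacerString = string.replace(/[^a-zA-Z0-9]/g, " ");
  return compiledChecks.some(
    (check) => check.spaced.test(spacerString) || check.normal.test(normalString)
  );
}

console.log(testProfanityCompiled("I am a bannedWord1"));
console.log(testProfanityCompiled("b a N_N e d w o r d 2"));
console.log(testProfanityCompiled("No banned words here"));
```

Unlike the original, this version always returns a strict boolean, since it uses .test() rather than .match().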

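Explanation

As you can see, the filter handles all kinds of punctuation, capitalization, and even single spaces or symbols between the letters of a bad word. Note, however, that to avoid the "wash it" scenario I described (which could cause clean messages to be deleted by accident), I made it so that something like "bannedWord22" is not treated the same as "bannedWord2". If you want the opposite behavior (so that "bannedWord22" is treated the same as "bannedWord2"), you must remove both \\b tokens from the regex of the normal check.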
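The effect of those two \\b tokens can be checked in isolation. The following is a minimal sketch; the variable names strict and loose are only for illustration.

```javascript
// With word boundaries: the banned word must stand alone.
const strict = new RegExp("\\bbannedWord2\\b", "i");
// Without word boundaries: the banned word may be part of a longer word.
const loose = new RegExp("bannedWord2", "i");

console.log(strict.test("this is bannedWord2!")); // true
console.log(strict.test("this is bannedWord22")); // false
console.log(loose.test("this is bannedWord22"));  // true
```

The strict form fails on "bannedWord22" because the character after the first "2" is another word character, so there is no boundary at that position.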

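I'll also explain the regexes so that you fully understand what is going on here:

[^a-zA-Z0-9 ] means "select any character that is not in the ranges a-z, A-Z, 0-9, and is not a space" (so every character outside those ranges is replaced with an empty string, which removes it from the string).
\W means "select any character that is not a word character", where a "word character" is one of a-z, A-Z, 0-9, or the underscore.
\b means "word boundary": a zero-width position between a word character and a non-word character, which marks where a word starts or stops. That covers positions next to spaces as well as the start and end of the string when a word sits there. The \b is escaped with an additional \ (becoming \\b) so that JavaScript does not read it as a string escape sequence (a backspace) instead of a regex token.
The g and i flags used in both regex checks mean "global" and "case-insensitive", respectively.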
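Each of the points above can be tried out directly. This is a small illustrative sketch; the sample strings are made up for the demonstration.

```javascript
const sample = "b_a-d w.o_r,d";

// [^a-zA-Z0-9 ] drops everything except letters, digits, and spaces.
console.log(sample.replace(/[^a-zA-Z0-9 ]/g, "")); // "bad word"

// \W drops non-word characters, but the underscore counts as a word character.
console.log(sample.replace(/\W/g, "")); // "b_adwo_rd"

// Inside a string, "\b" is a backspace character, not a word boundary,
// so the backslash has to be doubled when building a RegExp from a string.
const unescaped = new RegExp("\bcat\b", "i");
const escaped = new RegExp("\\bcat\\b", "i");
console.log(unescaped.test("a cat")); // false
console.log(escaped.test("a cat"));   // true

// The i flag ignores case; the g flag makes replace() act on every match.
console.log(escaped.test("A CAT sat"));      // true
console.log("a-a-a".replace(/a/, "x"));      // "x-a-a"
console.log("a-a-a".replace(/a/g, "x"));     // "x-x-x"
```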

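Of course, to get this working with your Discord bot, all you have to do in your message handler is something like this (and be sure to replace badWords with your filter variable inside testProfanity()):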

if (testProfanity(message.content)) return message.delete();
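To show how that one-liner might sit inside a handler, here is a sketch that runs without a Discord connection. It assumes only that the message object has a content string and a delete() method, as used in the line above. The handleMessage and fakeMessage names and the contents of filter are hypothetical stand-ins, not part of any library.

```javascript
// Placeholder list; in a real bot this would be your own filter array.
const filter = ["bannedWord1", "bannedWord2"];

function testProfanity(string) {
  const normalString = string.replace(/[^a-zA-Z0-9 ]/g, "");
  const spacerString = string.replace(/[^a-zA-Z0-9]/g, " ");
  return filter.some((swear) => {
    const filtered = swear.replace(/\W/g, "");
    const spaced = filtered.split("").join(" ");
    return (
      new RegExp(`\\b${spaced}\\b`, "i").test(spacerString) ||
      new RegExp(`\\b${filtered}\\b`, "i").test(normalString)
    );
  });
}

// Hypothetical handler: expects an object with `content` and `delete()`.
function handleMessage(message) {
  if (testProfanity(message.content)) return message.delete();
  return undefined;
}

// Stand-in message objects that record which messages were deleted.
const deleted = [];
const fakeMessage = (content) => ({
  content,
  delete() {
    deleted.push(content);
    return Promise.resolve(this);
  },
});

handleMessage(fakeMessage("hello there"));
handleMessage(fakeMessage("b.a.n.n.e.d w-o-r-d 1"));
console.log(deleted);
```

In a real bot, handleMessage would be registered as the listener for the library's message event, and the fake objects would be replaced by the messages the library passes in.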

If you want to learn more about regex, or if you want to experiment with it and test it out, this is a great resource.

2023-12-14
