Challenging regex not working for a fellow novice

I want to capture every URL in a document, except those from google, bscscan, github, etc.

So far I have this regex working:

(www|http:|https:)+[\W]+(?!bscscan|google|binance|t\.me)[\w]+

When it is applied to this paragraph:

https://bscscan.com   testing123
website: https://www.yahoo.com
another one www.bing.com is great
www.binance.org
http://bob.bscscan.com
https://twitter.google.com
https://google.twitter.com
https://t.me/rawr omg

it only matches

1) https://www
2) www.bing
3) http://bob
4) https://twitter

but I want it to match

https://yahoo.com
www.bing.com

Desired fixes

#1) Include the full URL.

#2) Skip any URL that has ANY mention of the negative search words anywhere in the link.

Answer

Use

\b(?:www\.|https?:)(?!\S*\b(?:bscscan|google|binance|t\.me)\b)\S+

See regex proof.

EXPLANATION

--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  (?:                      group, but do not capture:
--------------------------------------------------------------------------------
    www                      'www'
--------------------------------------------------------------------------------
    \.                       '.'
--------------------------------------------------------------------------------
   |                        OR
--------------------------------------------------------------------------------
    http                     'http'
--------------------------------------------------------------------------------
    s?                       's' (optional (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    :                        ':'
--------------------------------------------------------------------------------
  )                        end of grouping
--------------------------------------------------------------------------------
  (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
    \S*                      non-whitespace (all but \n, \r, \t, \f,
                             and " ") (0 or more times (matching the
                             most amount possible))
--------------------------------------------------------------------------------
    \b                       the boundary between a word char (\w)
                             and something that is not a word char
--------------------------------------------------------------------------------
    (?:                      group, but do not capture:
--------------------------------------------------------------------------------
      bscscan                  'bscscan'
--------------------------------------------------------------------------------
     |                        OR
--------------------------------------------------------------------------------
      google                   'google'
--------------------------------------------------------------------------------
     |                        OR
--------------------------------------------------------------------------------
      binance                  'binance'
--------------------------------------------------------------------------------
     |                        OR
--------------------------------------------------------------------------------
      t                        't'
--------------------------------------------------------------------------------
      \.                       '.'
--------------------------------------------------------------------------------
      me                       'me'
--------------------------------------------------------------------------------
    )                        end of grouping
--------------------------------------------------------------------------------
    \b                       the boundary between a word char (\w)
                             and something that is not a word char
--------------------------------------------------------------------------------
  )                        end of look-ahead
--------------------------------------------------------------------------------
  \S+                      non-whitespace (all but \n, \r, \t, \f,
                           and " ") (1 or more times (matching the
                           most amount possible))
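
As a quick check, here is a minimal Python sketch that runs the pattern above against the sample paragraph from the question. The names URL_PATTERN and SAMPLE are only illustrative, and Python's re module is an assumption on my part, since the question does not say which regex engine is in use.

```python
import re

# Pattern from the answer above, unchanged.
URL_PATTERN = re.compile(
    r"\b(?:www\.|https?:)(?!\S*\b(?:bscscan|google|binance|t\.me)\b)\S+"
)

# Sample paragraph copied from the question.
SAMPLE = """\
https://bscscan.com   testing123
website: https://www.yahoo.com
another one www.bing.com is great
www.binance.org
http://bob.bscscan.com
https://twitter.google.com
https://google.twitter.com
https://t.me/rawr omg
"""

# The pattern has no capturing groups, so findall returns whole matches.
matches = URL_PATTERN.findall(SAMPLE)
print(matches)  # ['https://www.yahoo.com', 'www.bing.com']
```

Note that the \b anchors around the blocked words mean a word is only rejected when it appears as a whole token inside the link; drop them if a blocked word embedded in a longer label should also disqualify the URL.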

Try this one; it has enough expressions for you to modify it depending on how it is implemented:

/(|www\.|http\:\/\/|https\:\/\/)(?!(bscscan|google|binance|t\.me|twitter|bob))(yahoo\.com|bing\.com)/g

This will match any of the following variations:

https://yahoo.com   <- your required one
www.bing.com        <- your required one
www.yahoo.com
https://bing.com
http://bing.com
bing.com            <- remove the "|" before "www" if you don't want this one
yahoo.com           <- remove the "|" before "www" if you don't want this one

If you add (https\:\/\/www\.)|(http\:\/\/www\.), it will also match https://www and http://www.
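
To see how this pattern behaves on the sample paragraph from the question, here is a rough Python sketch. Python's re module is an assumption (the /.../g delimiters suggest JavaScript), so the slashes and colons are left unescaped. The names BASE, EXTENDED and full_matches are made up for illustration, and placing the two extra prefixes inside the first group is my reading of the suggestion above, not something this answer spells out.

```python
import re

# Sample paragraph copied from the question.
SAMPLE = """\
https://bscscan.com   testing123
website: https://www.yahoo.com
another one www.bing.com is great
www.binance.org
http://bob.bscscan.com
https://twitter.google.com
https://google.twitter.com
https://t.me/rawr omg
"""

LOOKAHEAD = r"(?!(bscscan|google|binance|t\.me|twitter|bob))"
DOMAINS = r"(yahoo\.com|bing\.com)"

# The pattern as posted: optional prefix, lookahead, allowed domain.
BASE = re.compile(r"(|www\.|http://|https://)" + LOOKAHEAD + DOMAINS)

# Assumed placement of the extra "scheme + www." prefixes.
EXTENDED = re.compile(
    r"(|https://www\.|http://www\.|www\.|http://|https://)" + LOOKAHEAD + DOMAINS
)


def full_matches(pattern, text):
    """Return whole matches; findall would return tuples of groups here."""
    return [m.group(0) for m in pattern.finditer(text)]


base_matches = full_matches(BASE, SAMPLE)
extended_matches = full_matches(EXTENDED, SAMPLE)
print(base_matches)      # ['www.yahoo.com', 'www.bing.com']
print(extended_matches)  # ['https://www.yahoo.com', 'www.bing.com']
```

Because the last group only admits yahoo.com or bing.com, this is effectively an allowlist: the negative lookahead sits at the same position as that group and never rejects anything, so any new domain you want to keep has to be added to the final group by hand.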