How do I use regex in the Nginx location block section?

pinkon5k posted on 2023-06-05 in Nginx

Nginx regex location syntax

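Regular expressions can be used in the Nginx location block section; this is implemented with the PCRE engine.
What exactly does this feature support, given that it is not fully documented?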

hec6srdp1#

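Nginx location:

The Nginx location block section has a search order, a modifier, an implicit match type, and an implicit switch that decides whether the search stops on a match. The table below describes these for each modifier, including the regex ones.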

# --------------------------------------------------------------------------------------------------------------------------------------------
# Search-Order       Modifier       Description                                                        Match-Type        Stops-search-on-match
# --------------------------------------------------------------------------------------------------------------------------------------------
#     1st               =           The URI must match the specified pattern exactly                  Simple-string              Yes
#     2nd               ^~          The URI must begin with the specified pattern                     Simple-string              Yes
#     3rd             (None)        The URI must begin with the specified pattern                     Simple-string               No
#     4th               ~           The URI must be a case-sensitive match to the specified Rx      Perl-Compatible-Rx      Yes (first match)                 
#     4th               ~*          The URI must be a case-insensitive match to the specified Rx    Perl-Compatible-Rx      Yes (first match)
#     N/A               @           Defines a named location block.                                   Simple-string              Yes
# --------------------------------------------------------------------------------------------------------------------------------------------
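As a rough illustration of the search order in the table, the sketch below puts one location of each kind into a single server block (assumed to live inside the http context). The paths /static/ and /images/ and the response bodies are hypothetical; they only show which block would answer a request.

```nginx
server {
    listen 80;

    # Exact match: only the URI "/" itself; the search stops here.
    location = / {
        return 200 "exact";
    }

    # Prefix match with ^~: if this is the longest matching prefix,
    # regex locations are not checked at all.
    location ^~ /static/ {
        return 200 "prefix, regex skipped";
    }

    # Plain prefix match: remembered, but regex locations are still checked.
    location /images/ {
        return 200 "prefix";
    }

    # Case-insensitive regex: the first matching regex in file order wins.
    location ~* \.(?:png|jpe?g|gif)$ {
        return 200 "regex";
    }
}
```

Under these assumptions, a request for /images/a.png would be answered by the regex block, /static/a.png by the ^~ block, and /images/readme.txt by the plain prefix block.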

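Capturing groups:

Capturing groups and expression evaluation with () are supported. In this example, location ~ ^/(?:index|update)$ matches URLs ending with example.com/index or example.com/update.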

# -----------------------------------------------------------------------------------------
#    ()    : Group/Capturing-group, capturing means match and retain/output/use what matched
#            the pattern inside (). The default bracket mode is "capturing group" while (?:)
#            is a non-capturing group. Example: (?:a|b) matches a or b in non-capturing mode
# ----------------------------------------------------------------------------------------- 
#    ?:    : Non capturing group
#    ?=    : Positive look ahead 
#    ?!    : is for negative look ahead (do not match the following...)
#    ?<=   : is for positive look behind
#    ?<!   : is for negative look behind
# -----------------------------------------------------------------------------------------
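To illustrate how captured text can be reused, here is a minimal sketch. Numbered captures are available as $1, $2, and so on, and named captures become variables of the same name. The /user/ and /download/ paths and the X-User-Id header name are hypothetical.

```nginx
# Numbered capture: $1 holds whatever matched the first () group.
location ~ ^/user/([0-9]+)$ {
    add_header X-User-Id $1;
    return 204;
}

# Named capture: (?<file>...) creates the variable $file.
location ~ ^/download/(?<file>[^/]+)\.zip$ {
    return 200 "requested: $file";
}

# Non-capturing group: groups the alternatives without creating $1.
location ~ ^/(?:index|update)$ {
    return 200 "index or update";
}
```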

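The forward slash:

Not to be confused with the regex backslash \. In nginx, the forward slash / is used to match any sub-location, including none, for example location /. In the context of regex support, the following explanation applies: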

# -----------------------------------------------------------------------------------------
#     /    : It doesn't actually do anything. In Javascript, Perl and some other languages, 
#            it is used as a delimiter character explicitly for regular expressions.
#            Some languages like PHP use it as a delimiter inside a string, 
#            with additional options passed at the end, just like Javascript and Perl.
#            Nginx does not use delimiters; / can be escaped with \/ for code portability,
#            BUT this is not required: in nginx, / is handled literally
#            (it has no meaning other than /)
# -----------------------------------------------------------------------------------------

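The backslash:

The first purpose of the regex special character \ is to escape the next character. Note, however, that in many cases \ followed by a character has a different meaning (for example \d, \w, or \b); regex references list these sequences in full.
Nginx does not require the forward slash / to be escaped, and it does not reject an escaped one either, just as any other character can be escaped. So \/ is interpreted as / and matches /. One reason to escape the forward slash in an nginx context might be code portability.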
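A small sketch of the point above: the two forms below describe the same regex as far as the PCRE engine is concerned, so only one is needed in practice. The path is borrowed from the examples further down in this answer; the 204 response is illustrative.

```nginx
# Unescaped form: / has no special meaning to the regex engine.
location ~ ^/core/img/favicon\.ico$ {
    return 204;
}

# Escaped form, equivalent to the one above (shown commented out
# because the first matching regex would already win):
# location ~ ^\/core\/img\/favicon\.ico$ {
#     return 204;
# }
```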

Other regex characters

Below is a non-exhaustive list of the regex constructs that can be used:

# -----------------------------------------------------------------------------------------
#     ~     : Enable regex mode for location (as a modifier, ~ means case-sensitive match)
#     ~*    : case-insensitive match
#     |     : Or
#     ()    : Match group or evaluate the content of ()
#     $     : the expression must be at the end of the evaluated text 
#             (no char/text after the match) $ is usually used at the end of a regex 
#             location expression. 
#     ?     : Check for zero or one occurrence of the previous char ex jpe?g
#     ^~    : Prefix-match modifier (not a regex): the URI must begin with the given
#             string. Note that nginx will not perform any further regular expression
#             match even if another match is available (check the table above).
#             Because the pattern is a plain string, regex syntax such as .* is not
#             interpreted here. Example: location ^~ /realestate/
#             Nginx evaluates this as "don't check regex locations if this
#             location is the longest prefix match".
#     =     : Exact match, no sub folders (location = /)
#     ^     : Match the beginning of the text (opposite of $). By itself, ^ is a 
#             shortcut for all paths (since they all have a beginning).
#     .*    : Match zero, one or more occurrence of any char
#     \     : Escape the next char
#     .     : Any char 
#     *     : Match zero, one or more occurrence of the previous char
#     !     : Not, but only inside a look-around such as (?!...) (negative look ahead)
#     {}    : Match a specific number of occurrence ex. [0-9]{3} match 342 but not 32
#             {2,4} match length of 2, 3 and 4
#     +     : Match one or more occurrence of the previous char 
#     []    : Match any single char listed inside
# --------------------------------------------------------------------------------------------
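One practical detail when using the {} quantifier from the list above: curly braces also delimit blocks in the nginx configuration syntax, so a regex that contains them generally has to be wrapped in quotes. A minimal sketch, with hypothetical paths and response bodies:

```nginx
# Quoted because the pattern contains { and }.
# Matches /id/342 but not /id/32.
location ~ "^/id/[0-9]{3}$" {
    return 200 "three digits";
}

# Case-insensitive match: .jpg, .JPG, .jpeg, .JPEG, ...
location ~* \.jpe?g$ {
    return 200 "jpeg image";
}
```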

Examples:

location ~ ^/(?:index)\.php(?:$|/)
location ~ ^\/(?:core\/img\/background\.png|core\/img\/favicon\.ico)(?:$|\/)
location ~ ^/(?:index|core/ajax/update|ocs/v[12]|status|updater/.+|oc[ms]-provider/.+)\.php(?:$|/)
