java - How to write a regular expression that extracts a value with an optional prefix and suffix


As the title describes, the regular expression should extract information from a given string, given a prefix (optional) and a suffix (optional).

So:

prefix_group_1_suffix returns group_1 when the prefix is 'prefix_' and the suffix is '_suffix'

prefix_group_1 returns group_1 when the prefix is 'prefix_' and the suffix is null <-- my code can't handle this situation

group_1_suffix returns group_1 when the prefix is null and the suffix is '_suffix'

group_1 returns group_1 when the prefix is null and the suffix is null <-- my code can't handle this situation

Here is my code. I found that it doesn't work when the suffix is empty:

    // Requires java.util.regex.Matcher and java.util.regex.Pattern.
    String itemname = "";
    String prefix = "test_";
    String suffix = "";
    String itemstring = prefix + "item_1" + suffix;

    // Quote the prefix and suffix so they are matched literally.
    String prefix_quote = "".equals(prefix) ? "" : Pattern.quote(prefix);
    String suffix_quote = "".equals(suffix) ? "" : Pattern.quote(suffix);
    String regex = prefix_quote + "(.*?)" + suffix_quote;

    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(itemstring);
    // Only the first match is of interest.
    if (matcher.find()) {
        itemname = matcher.group(1);
    }

    System.out.println("itemstring '" + itemstring + "'");
    System.out.println("prefix quote '" + prefix_quote + "'");
    System.out.println("suffix quote '" + suffix_quote + "'");
    System.out.println("regex '" + regex + "'");
    System.out.println("itemname '" + itemname + "'");

And here is the output:

    itemstring 'test_item_1'
    prefix quote '\Qtest_\E'
    suffix quote ''
    regex '\Qtest_\E(.*?)'
    itemname ''

But the above code works for the other two conditions.

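The reason why your code fails lies in the lazy quantifier `.*?`. Its priority is to match as little as possible, preferably the empty string, so when nothing follows it in the regex it does exactly that. Therefore you need to anchor the regex to the start/end of the string and to the possible prefix/suffix.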
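To see the lazy quantifier's behaviour in isolation, here is a minimal sketch. The class name `LazyQuantifierDemo` and the helper `firstGroup` are illustrative and not part of the question's code. Without anything after the group, `(.*?)` is satisfied with zero characters; once something must match after it (here `$`), it is forced to expand.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyQuantifierDemo {
    // Hypothetical helper: returns group 1 of the first match, or null if nothing matches.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String prefix = Pattern.quote("test_");

        // Nothing follows the lazy group, so it stops after zero characters.
        System.out.println("'" + firstGroup(prefix + "(.*?)", "test_item_1") + "'");  // prints ''

        // The end anchor can only match at the end, so the group must grow.
        System.out.println("'" + firstGroup(prefix + "(.*?)$", "test_item_1") + "'"); // prints 'item_1'
    }
}
```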

To anchor the match, you can use lookaround assertions:

    String prefix = "test_";
    String suffix = "";
    String itemstring = prefix + "item_1" + suffix;
    String prefix_quote = "".equals(prefix) ? "" : Pattern.quote(prefix);
    String suffix_quote = "".equals(suffix) ? "" : Pattern.quote(suffix);
    // ^ and $ sit in sequence with the prefix/suffix, so the match can only
    // begin right after the prefix and end right before the suffix.
    String regex = "(?<=^" + prefix_quote + ")(.*?)(?=" + suffix_quote + "$)";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(itemstring);

For the values above, this results in the regex

    (?<=^\Qtest_\E)(.*?)(?=$)

Explanation:

    (?<=         # Assert that it's possible to match this before the current position:
     ^           #   the start of the string,
     \Qtest_\E   #   followed by the quoted prefix (omitted if the prefix is empty)
    )            # End of lookbehind
    (.*?)        # Match as few characters as possible (here "item_1")
    (?=          # Assert that it's possible to match this after the current position:
     $           #   the suffix (empty here), followed by the end of the string
    )            # End of lookahead
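The four cases from the question can be checked with a small sketch along these lines. The class name `AffixExtractor` and the method `extract` are hypothetical names chosen for illustration. The sketch assumes the prefix and suffix are passed as empty strings when absent, and it places `^` and `$` in sequence with the quoted prefix and suffix (not as `|` alternatives), so a search cannot succeed at index 0 before a non-empty prefix has been skipped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AffixExtractor {
    // Hypothetical helper: returns the text between prefix and suffix, or null
    // when the input does not start with the prefix or end with the suffix.
    static String extract(String input, String prefix, String suffix) {
        // Quote non-empty affixes so they are matched literally.
        String prefixQuote = prefix.isEmpty() ? "" : Pattern.quote(prefix);
        String suffixQuote = suffix.isEmpty() ? "" : Pattern.quote(suffix);

        // Lookbehind: start of string followed by the prefix.
        // Lookahead: the suffix followed by end of string.
        String regex = "(?<=^" + prefixQuote + ")(.*?)(?=" + suffixQuote + "$)";

        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("prefix_group_1_suffix", "prefix_", "_suffix")); // prints group_1
        System.out.println(extract("prefix_group_1", "prefix_", ""));               // prints group_1
        System.out.println(extract("group_1_suffix", "", "_suffix"));               // prints group_1
        System.out.println(extract("group_1", "", ""));                             // prints group_1
    }
}
```

Because both affixes sit inside lookarounds, the overall match and group 1 are the same text, so `m.group()` would work equally well here.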
