javascript - How do I write a regex to search for items within UA-Parser?
I am using ua-parser to create a table of devices for analytics. I have a CSV of user-agent strings from our server, and I'm using the stock ua-parser Node package (ua-parser-js).
However, I am having difficulty parsing Droid user-agent strings.
The current regex for Droid is:
/\s((milestone|droid[2x]?))[globa\s]*\sbuild\//i
The above matches:
mozilla/5.0 (linux; u; android 2.3.4; en-us; droidx build/4.5.1_57_dx8-51) applewebkit/533.1 (khtml, gecko) version/4.0 mobile safari/533.1,182
but it does not match:
mozilla/5.0 (linux; u; android 4.1.2; en-us; droid razr build/9.8.2o-72_vzw-16) applewebkit/534.30 (khtml, gecko) version/4.0 mobile safari/534.30,652
mozilla/5.0 (linux; u; android 2.3.5; en-us; droid x2 build/4.5.1a-dtn-200-18) applewebkit/533.1 (khtml, gecko) version/4.0 mobile safari/533.1,152
How should I modify the regex so that it also matches the strings above?
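To solve this, you need to isolate the part of the string that is causing the problem.
Let's cut the strings down and look only at the part we're interested in:
droidx build
compared with
droid razr build
or
droid x2 build
We can see that they all match droid, and that [2x] is optional, so that part doesn't matter.
The problem is in the next bit: [globa\s]*
On its own this part can match nothing at all (the * means zero or more), but it does require that everything between the word droid (with or without the following 2 or x) and the space before build consists only of the characters g, l, o, b, a, or whitespace.
We have razr and x2 in the failing strings. If any of the characters in those words is not in the above list, the match fails. (As it turns out, the only one that is in the list is the a in razr, but a single character outside the list is enough to make it fail.)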
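The diagnosis above can be checked with a short script. This is only a sketch: the sample strings are shortened versions of the three user agents from the question, and diagnose is a hypothetical helper name, not part of ua-parser-js.

```javascript
// Original device pattern from the question.
const original = /\s((milestone|droid[2x]?))[globa\s]*\sbuild\//i;

// Shortened versions of the three user-agent strings from the question.
const samples = [
  "en-us; droidx build/4.5.1_57_dx8-51)",
  "en-us; droid razr build/9.8.2o-72_vzw-16)",
  "en-us; droid x2 build/4.5.1a-dtn-200-18)",
];

// Hypothetical helper: report whether the pattern matches, and which
// characters between "droid" and " build/" fall outside [globa\s].
function diagnose(ua) {
  const between = /droid[2x]?(.*?)\sbuild\//i.exec(ua)[1];
  const offending = [...between].filter((ch) => !/[globa\s]/i.test(ch));
  return { matches: original.test(ua), between, offending };
}

const report = samples.map(diagnose);
for (const entry of report) {
  console.log(entry);
}
```

For the two failing strings, the offending list shows exactly which characters the class rejects.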
So the quick and easy fix here is to add the characters r, z, x, and 2 to [globa\s], giving [globarzx2\s].
This fixes the given examples, i.e. it will accept razr or x2 in that section of the string.
However, to allow for other possible cases, you may want to be a bit more lenient and allow any alphanumeric characters. It's up to you, but there's no predicting what UA strings are going to appear in the future.
So I would suggest replacing the whole of globa with a-z0-9:
/\s((milestone|droid[2x]?))[a-z0-9\s]*\sbuild\//i
Even this may not pick up every possible variant that appears, but that's the trouble with user agent strings: they're not a well-defined format and can contain pretty much anything.
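To see the difference between the quick fix and the more lenient class, here is a small comparison. It is a sketch only: the user-agent strings are shortened, and the "droid bionic" entry is a made-up string (not from the question) standing in for a variant that might turn up later.

```javascript
// Quick fix: only the extra characters needed for "razr" and "x2".
const quickFix = /\s((milestone|droid[2x]?))[globarzx2\s]*\sbuild\//i;
// Lenient version: any letters, digits, or whitespace before " build/".
const lenient = /\s((milestone|droid[2x]?))[a-z0-9\s]*\sbuild\//i;

const uaStrings = {
  droidx: "mozilla/5.0 (linux; u; android 2.3.4; en-us; droidx build/4.5.1_57_dx8-51)",
  razr: "mozilla/5.0 (linux; u; android 4.1.2; en-us; droid razr build/9.8.2o-72_vzw-16)",
  x2: "mozilla/5.0 (linux; u; android 2.3.5; en-us; droid x2 build/4.5.1a-dtn-200-18)",
  // Hypothetical string, invented for illustration only.
  bionic: "mozilla/5.0 (linux; u; android 4.1.2; en-us; droid bionic build/0.0.0)",
};

// Record, for each string, whether each pattern matches.
const results = Object.fromEntries(
  Object.entries(uaStrings).map(([name, ua]) => [
    name,
    { quickFix: quickFix.test(ua), lenient: lenient.test(ua) },
  ])
);

console.log(results);
```

Both patterns accept the three strings from the question, but only the lenient one accepts the invented bionic string, because i, n, and c are not in the quick-fix list.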
[Edit] The OP adds a request for the razr or x2 strings to be included in the returned result string.
The short answer is that this means moving the relevant part of the pattern into the bracketed section, alongside the droid pattern.
However, this complicates things, because while you want these strings included, there may be others you want excluded, i.e. the strings matched by the [globa\s] pattern. The problem here is that I don't have examples of what the excluded strings may have been, or why they're excluded. Likewise, I don't know what strings you want to include, beyond razr or x2. I guess we'd need to be relatively lenient, but it's not easy to know how to distinguish them without knowing the possibilities (and indeed, it may be difficult even when you do know them).
Given the above, the only real option open to me is to suggest adding razr and x2 to the bracketed section, picked out specifically:
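/\s((milestone|droid[2x]?(\s(razr|x2))?))[a-z0-9\s]*\sbuild\//i
This will match both of the required strings and return razr or x2 as part of the result. Note that the optional group must not consume the space after razr or x2, because the \sbuild that follows still needs it.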
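One way to check what ends up in the returned result is to look at the first capture group. The sketch below assumes the optional group is written without a trailing \s. With a trailing \s inside the group the pattern still matches, but the group gets skipped during backtracking and only droid is captured. The names deviceModel, models, and trailingSpaceModel are hypothetical, and the strings are shortened from the question.

```javascript
// Optional (razr|x2) group without a trailing \s, so the space before
// "build" is left for the \sbuild part of the pattern.
const devicePattern =
  /\s((milestone|droid[2x]?(\s(razr|x2))?))[a-z0-9\s]*\sbuild\//i;

// Same pattern, but with a trailing \s inside the optional group.
const withTrailingSpace =
  /\s((milestone|droid[2x]?(\s(razr|x2)\s)?))[a-z0-9\s]*\sbuild\//i;

// Hypothetical helper: return the first capture group, or null if no match.
function deviceModel(pattern, ua) {
  const m = pattern.exec(ua);
  return m ? m[1] : null;
}

const uas = [
  "en-us; droidx build/4.5.1_57_dx8-51)",
  "en-us; droid razr build/9.8.2o-72_vzw-16)",
  "en-us; droid x2 build/4.5.1a-dtn-200-18)",
];

const models = uas.map((ua) => deviceModel(devicePattern, ua));
console.log(models); // [ 'droidx', 'droid razr', 'droid x2' ]

// The trailing-\s variant matches too, but captures less.
const trailingSpaceModel = deviceModel(withTrailingSpace, uas[1]);
console.log(trailingSpaceModel); // 'droid'
```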
The problem, of course, is that it won't match any other possible variants that haven't been described here. Allowing for more would require knowing what the possible variants are, but since we've only been asked about these specific examples, that's all I can offer for now.