If you have read my previous posts, you might know I am currently working on a new project to move some services to a self-hosted solution. As part of this, I have been dealing with Unicode characters in regular expressions.
While doing this, I found that I was writing the same function repeatedly, the only difference being the number of matches returned. So I decided to refactor.
Here is my solution:

    import re
    from typing import Optional


    def findMatches(string, regex) -> Optional[dict]:
        """
        This is a generic matching function.

        Warning! Your regular expression MUST use 'Named Groups' -> (?P<name>...)
        or this function will return an empty dictionary.

        :param string: The text you are searching
        :type string: str
        :param regex: The regular expression string you are using to search
        :type regex: str
        :returns: A dictionary of named key/value pairs. The key is derived
                  from (?P<name>...). None is returned if no match is found.
        :rtype: dict or None
        """
        matcher = re.compile(regex, re.UNICODE)
        match = matcher.match(string)
        if match:
            matches = dict()
            # Copy each named group into a plain dictionary
            for key in match.groupdict():
                matches[key] = match.group(key)
            return matches
        # No matches
        return None
It takes two arguments: the string to be searched and the regex to be used. I go through the basic process of making the re object and do the match.
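One detail of that compile-and-match step is worth spelling out: match() only tries the pattern at the very start of the string, while search() scans forward. Here is a small sketch of the difference; the pattern and sample text are made up for illustration and are not from my project.

```python
import re

# Hypothetical sample pattern, chosen only for illustration.
pattern = re.compile(r"(?P<word>\w+)", re.UNICODE)

# For str patterns, \w also matches non-ASCII letters such as "\u00e9" (e with acute).
at_start = pattern.match("caf\u00e9 au lait")

# match() only tries at position 0, so leading punctuation means no match.
not_at_start = pattern.match("... caf\u00e9")

# search() scans forward and finds the first occurrence anywhere in the text.
found_later = pattern.search("... caf\u00e9")

print(at_start.group("word"))
print(not_at_start)
print(found_later.group("word"))
```

In Python 3, str patterns are Unicode-aware by default, so the re.UNICODE flag is redundant there, but it does no harm and makes the intent explicit.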
I then go over the match object's group dictionary and get the names of the keys. I use these keys to build a simple dictionary storing the matching key/value pairs.
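To show what comes back, here is a rough usage sketch. The function is repeated in a condensed form so the snippet runs on its own (it returns groupdict() directly, which should give the same result as the loop), and the pattern and input strings are hypothetical examples.

```python
import re
from typing import Optional


def findMatches(string: str, regex: str) -> Optional[dict]:
    """Condensed copy of the function above, repeated so this runs standalone."""
    match = re.compile(regex, re.UNICODE).match(string)
    # groupdict() already maps each group name to its matched text
    return match.groupdict() if match else None


# Hypothetical pattern: a name, a colon, then a city.
regex = r"(?P<name>\w+):\s*(?P<city>\w+)"

# Both named groups match, including the non-ASCII letters.
result = findMatches("Zo\u00eb: K\u00f6ln", regex)

# No colon in the input, so the pattern fails and None comes back.
no_match = findMatches("12345", regex)

# The pattern matches but has no named groups, so the dictionary is empty.
no_named_groups = findMatches("Zo\u00eb", r"\w+")

print(result)
print(no_match)
print(no_named_groups)
```

The three cases line up with the docstring: a populated dictionary on a match with named groups, None when nothing matches, and an empty dictionary when the pattern has no named groups.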
No more writing the same basic function repeatedly.