Zanurkuj w Pythonie/Case Study: Parsing Phone Numbers: Difference between revisions
Removed content Added content
m No edit summary |
→Case study: Parsing Phone Numbers: translated headings |
Line 1:
{{Podświetl|py}}
== Case Study: Parsing Phone Numbers ==
So far you've concentrated on matching whole patterns. Either the pattern matches, or it doesn't. But regular expressions are much more powerful than that. When a regular expression does match, you can pick out specific pieces of it. You can find out what matched where.
Line 20:
Let's work through developing a solution for phone number parsing. This example shows the first step.
Example 7.10. Finding Numbers
>>> phonePattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})$') #(1)
Line 31 ⟶ 32:
# To get access to the groups that the regular expression parser remembered along the way, use the groups() method on the object that the search function returns. It will return a tuple of however many groups were defined in the regular expression. In this case, you defined three groups, one with three digits, one with three digits, and one with four digits.
# This regular expression is not the final answer, because it doesn't handle a phone number with an extension on the end. For that, you'll need to expand the regular expression.
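The two notes above can be checked directly. This is a minimal sketch that reuses the pattern from Example 7.10; the sample phone numbers are chosen here for illustration.

```python
import re

# Pattern from Example 7.10: three groups of digits separated by hyphens.
phonePattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})$')

# groups() returns one tuple entry per parenthesized group.
parts = phonePattern.search('800-555-1212').groups()
print(parts)

# A number with an extension on the end does not match at all,
# so search() returns None instead of a match object.
withExtension = phonePattern.search('800-555-1212-1234')
print(withExtension)
```

Because `search()` returns `None` on failure, calling `.groups()` on the result without checking it first would raise an `AttributeError`.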
Example 7.11. Finding the Extension
>>> phonePattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})-(\d+)$') #(1)
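As a rough illustration of what this pattern does and does not accept, the sketch below runs it against three inputs picked for this example: one with hyphens and an extension, one with spaces, and one without an extension.

```python
import re

# Pattern from Example 7.11: a fourth group captures the extension.
phonePattern = re.compile(r'^(\d{3})-(\d{3})-(\d{4})-(\d+)$')

# Hyphen-separated number with an extension: four groups come back.
withExtension = phonePattern.search('800-555-1212-1234').groups()
print(withExtension)

# Spaces instead of hyphens: the literal '-' in the pattern cannot match.
withSpaces = phonePattern.search('800 555 1212 1234')
print(withSpaces)

# No extension: the trailing '-(\d+)' is now required, so this fails too.
noExtension = phonePattern.search('800-555-1212')
print(noExtension)
```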
Line 48 ⟶ 50:
The next example shows the regular expression to handle separators between the different parts of the phone number.
'''
>>> phonePattern = re.compile(r'^(\d{3})\D+(\d{3})\D+(\d{4})\D+(\d+)$') #(1)
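A short sketch of how `\D+` (one or more non-digit characters) behaves as a separator. The test strings are illustrative and not taken from the revision shown here.

```python
import re

# \D+ accepts any run of one or more non-digits between the parts.
phonePattern = re.compile(r'^(\d{3})\D+(\d{3})\D+(\d{4})\D+(\d+)$')

# Spaces and hyphens both count as non-digit separators.
withSpaces = phonePattern.search('800 555 1212 1234').groups()
withHyphens = phonePattern.search('800-555-1212-1234').groups()
print(withSpaces, withHyphens)

# With no separators at all, \D+ has nothing to match.
noSeparators = phonePattern.search('80055512121234')
print(noSeparators)

# The extension is still mandatory in this version.
noExtension = phonePattern.search('800-555-1212')
print(noExtension)
```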
Line 68 ⟶ 70:
The next example shows the regular expression for handling phone numbers without separators.
'''
>>> phonePattern = re.compile(r'^(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$') #(1)
>>> phonePattern.search('80055512121234').groups() #(2)
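The effect of switching from `+` to `*` can be sketched as follows. Separators and the extension both become optional, but the pattern is still anchored at the start of the string. The inputs other than `'80055512121234'` are assumptions made for this example.

```python
import re

# \D* allows zero or more separators; (\d*) allows an empty extension.
phonePattern = re.compile(r'^(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')

# No separators at all.
noSeparators = phonePattern.search('80055512121234').groups()

# Mixed separators, including a letter before the extension.
mixed = phonePattern.search('800.555.1212 x1234').groups()

# No extension: the fourth group is an empty string, not None.
noExtension = phonePattern.search('800-555-1212').groups()

# A leading parenthesis still breaks the match, because ^ must be
# followed immediately by three digits.
leadingParen = phonePattern.search('(800)5551212 x1234')

print(noSeparators, mixed, noExtension, leadingParen)
```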
Line 87 ⟶ 89:
The next example shows how to handle leading characters in phone numbers.
'''Example 7.14.
>>> phonePattern = re.compile(r'^\D*(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$') #(1)
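A minimal sketch of what the leading `\D*` buys and where it still falls short. The sample strings are assumptions chosen to exercise both cases.

```python
import re

# A leading \D* skips any non-digits before the area code.
phonePattern = re.compile(r'^\D*(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')

# The opening parenthesis is now absorbed by the leading \D*.
leadingParen = phonePattern.search('(800)5551212 ext. 1234').groups()

# Plain numbers still work; the leading \D* simply matches nothing.
plain = phonePattern.search('800-555-1212').groups()

# A leading digit before the area code is a problem: \D* cannot skip
# the '1', and '1-(' is not three digits, so there is no match.
leadingOne = phonePattern.search('work 1-(800) 555.1212 #1234')

print(leadingParen, plain, leadingOne)
```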
Line 104 ⟶ 106:
Let's back up for a second. So far the regular expressions have all matched from the beginning of the string. But now you see that there may be an indeterminate amount of stuff at the beginning of the string that you want to ignore. Rather than trying to match it all just so you can skip over it, let's take a different approach: don't explicitly match the beginning of the string at all. This approach is shown in the next example.
'''
>>> phonePattern = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$') #(1)
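Dropping the `^` anchor lets `search()` start the match at the first position where the rest of the pattern fits. The sketch below uses sample inputs chosen for illustration.

```python
import re

# No ^ anchor: search() scans forward until the pattern can match.
phonePattern = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')

# Leading text and a leading '1' are skipped; the match begins at '800'.
leadingJunk = phonePattern.search('work 1-(800) 555.1212 #1234').groups()

# Earlier cases continue to work.
plain = phonePattern.search('800-555-1212').groups()
noSeparators = phonePattern.search('80055512121234').groups()

print(leadingJunk, plain, noSeparators)
```

The `$` anchor is still present, so trailing characters after the extension digits would prevent a match.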
Line 123 ⟶ 125:
While you still understand the final answer (and it is the final answer; if you've discovered a case it doesn't handle, I don't want to know about it), let's write it out as a verbose regular expression, before you forget why you made the choices you made.
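As a sketch of what such a verbose form could look like, the block below spreads the previous pattern over several lines with `re.VERBOSE` and checks that it behaves the same as the compact version. The comments and the names `compactPattern`, `verbosePattern`, and `samples` are this example's own and are not taken from the revision shown here.

```python
import re

compactPattern = re.compile(r'(\d{3})\D*(\d{3})\D*(\d{4})\D*(\d*)$')

# With re.VERBOSE, whitespace in the pattern is ignored and '#' starts a comment.
verbosePattern = re.compile(r'''
                # no ^ anchor: the number can start anywhere
    (\d{3})     # area code: exactly 3 digits
    \D*         # optional separator: any number of non-digits
    (\d{3})     # next 3 digits
    \D*         # optional separator
    (\d{4})     # last 4 digits
    \D*         # optional separator
    (\d*)       # optional extension: any number of digits
    $           # end of string
    ''', re.VERBOSE)

samples = ['work 1-(800) 555.1212 #1234', '800-555-1212', '80055512121234']
for text in samples:
    # Both patterns should capture identical groups for every sample.
    print(text, verbosePattern.search(text).groups())
```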
'''
<nowiki>>>> phonePattern = re.compile(r'''