How can I check if a string represents an int, without using try/except?

Is there any way to tell whether a string represents an integer (e.g., '3', '-17' but not '3.14' or 'asfasfas') without using a try/except mechanism?

is_int('3.14') = False
is_int('-7')   = True

13 answers



If you're really just annoyed at using try/excepts all over the place, please just write a helper function:


def RepresentsInt(s):
    try:
        int(s)
        return True
    except ValueError:
        return False

>>> print(RepresentsInt("+123"))
True
>>> print(RepresentsInt("10.0"))
False

It's going to be WAY more code to exactly cover all the strings that Python considers integers. I say just be pythonic on this one.
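
To give a sense of why matching int() by hand is more work than it looks, here is a small sketch that probes which strings int() itself accepts. It assumes Python 3.6 or later (for the underscore form); the helper name `accepted_by_int` and the sample strings are only illustrative, not an exhaustive list.

```python
def accepted_by_int(s):
    """Return True if int(s) succeeds, False if it raises ValueError."""
    try:
        int(s)
        return True
    except ValueError:
        return False

# Illustrative samples: surrounding whitespace, underscores and
# non-ASCII decimal digits are accepted; floats, hex and exponents are not.
samples = ["42", "+42", " 42\n", "1_000", "\u0663", "4.0", "0x10", "1e3", ""]
for s in samples:
    print(repr(s), accepted_by_int(s))
```

A hand-written check that only looks for an optional sign followed by ASCII digits would disagree with int() on several of these inputs.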




With positive integers you could use .isdigit():


>>> '16'.isdigit()
True

It doesn't work with negative integers, though. I suppose you could try the following:


>>> s = '-17'
>>> s.startswith('-') and s[1:].isdigit()
True

It won't work with the '16.0' format, which is similar to int casting in this sense.
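
One caveat worth knowing, sketched here under the assumption of Python 3 (where str is Unicode): isdigit() also accepts characters such as superscript digits, which int() rejects, so isdecimal() may be a closer match. The helper name `looks_like_int` is made up for illustration.

```python
def looks_like_int(s):
    """Optional single sign followed by one or more decimal digits."""
    if s[:1] in ("-", "+"):
        s = s[1:]
    # isdecimal() is stricter than isdigit(): it rejects e.g. superscripts.
    return s.isdecimal()

print("\u00b2".isdigit())        # superscript two passes isdigit()
print(looks_like_int("\u00b2"))  # but is rejected here
print(looks_like_int("-17"))
print(looks_like_int("16.0"))
```

Slicing with `s[:1]` rather than indexing `s[0]` keeps the empty string from raising IndexError.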



def check_int(s):
    # s[:1] instead of s[0] so that an empty string returns False
    # rather than raising IndexError
    if s[:1] in ('-', '+'):
        return s[1:].isdigit()
    return s.isdigit()



You know, I've found (and I've tested this over and over) that try/except does not perform all that well, for whatever reason. I frequently try several ways of doing things, and I don't think I've ever found a method that uses try/except to perform the best of those tested, in fact it seems to me those methods have usually come out close to the worst, if not the worst. Not in every case, but in many cases. I know a lot of people say it's the "Pythonic" way, but that's one area where I part ways with them. To me, it's neither very performant nor very elegant, so, I tend to only use it for error trapping and reporting.


I was going to gripe that PHP, perl, ruby, C, and even the freaking shell have simple functions for testing a string for integer-hood, but due diligence in verifying those assumptions tripped me up! Apparently this lack is a common sickness.


Here's a quick and dirty edit of Bruno's post:


import sys, time, re

g_intRegex = re.compile(r"[-+]?\d+(\.0*)?$")

testvals = [
    # integers
    0, 1, -1, 1.0, -1.0,
    '0', '0.','0.0', '1', '-1', '+1', '1.0', '-1.0', '+1.0', '06',
    # non-integers
    1.1, -1.1, '1.1', '-1.1', '+1.1',
    '1.1.1', '1.1.0', '1.0.1', '1.0.0',
    '1.0.', '1..0', '1..',
    '0.0.', '0..0', '0..',
    'one', object(), (1,2,3), [1,2,3], {'one':'two'},
    # with spaces
    ' 0 ', ' 0.', ' .0','.01 '
]

def isInt_try(v):
    try:     i = int(v)
    except:  return False
    return True

def isInt_str(v):
    v = str(v).strip()
    return v=='0' or (v if v.find('..') > -1 else v.lstrip('-+').rstrip('0').rstrip('.')).isdigit()

def isInt_re(v):
    import re
    if not hasattr(isInt_re, 'intRegex'):
        isInt_re.intRegex = re.compile(r"[-+]?\d+(\.0*)?$")
    return isInt_re.intRegex.match(str(v).strip()) is not None

def isInt_re2(v):
    return g_intRegex.match(str(v).strip()) is not None

def timeFunc(func, times):
    t1 = time.time()
    for n in range(times):  # was xrange, which exists only on Python 2
        for v in testvals: 
            r = func(v)
    t2 = time.time()
    return t2 - t1

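def testFuncs(funcs):
    # header row: one column per function
    for func in funcs:
        sys.stdout.write( "\t%s\t|" % func.__name__)
    sys.stdout.write("\n")
    # one row per test value
    for v in testvals:
        sys.stdout.write("%s" % str(v))
        for func in funcs:
            sys.stdout.write( "\t\t%s\t|" % func(v))
        sys.stdout.write("\n")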

if __name__ == '__main__':
    print("tests..")
    testFuncs((isInt_try, isInt_str, isInt_re, isInt_re2))

    print("timings..")
    print("isInt_try:   %6.4f" % timeFunc(isInt_try, 10000))
    print("isInt_str:   %6.4f" % timeFunc(isInt_str, 10000))
    print("isInt_re:    %6.4f" % timeFunc(isInt_re, 10000))
    print("isInt_re2:   %6.4f" % timeFunc(isInt_re2, 10000))

Here's the interesting part of the output:


isInt_try:   1.2454
isInt_str:   0.7878
isInt_re:    1.5731
isInt_re2:   0.8087

As you can see, the string method is the fastest. It is almost twice as fast as the regex method that avoids relying on any globals, and more than half again faster than the try:except method. The regex method that relies on some globals (or, well, module attributes) is a close second.
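
For anyone wanting to rerun the comparison, a rough sketch using the standard timeit module follows. It assumes Python 3, times only string inputs, and uses a cut-down sample list, so absolute numbers will not match the ones above. The names `is_int_try`, `is_int_str`, `is_int_re`, `bench` and `SAMPLES` are chosen for this example.

```python
import re
import timeit

INT_RE = re.compile(r"[-+]?\d+(\.0*)?$")
SAMPLES = ["0", "0.0", "-1", "+1.0", "06", "1.1", "1..0", "one", " 0 "]

def is_int_try(v):
    try:
        int(v)
    except ValueError:
        return False
    return True

def is_int_str(v):
    v = v.strip()
    return v == "0" or (v if ".." in v else v.lstrip("-+").rstrip("0").rstrip(".")).isdigit()

def is_int_re(v):
    return INT_RE.match(v.strip()) is not None

def bench(func, repeat=10000):
    """Total seconds to run func over SAMPLES `repeat` times."""
    return timeit.timeit(lambda: [func(v) for v in SAMPLES], number=repeat)

if __name__ == "__main__":
    for f in (is_int_try, is_int_str, is_int_re):
        print("%-12s %6.4f" % (f.__name__, bench(f)))
```

Note that the three functions do not agree on every input (for instance, int() rejects '0.0' while the other two accept it), so the timings compare similar but not identical checks.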


I think of these, my choice would be


isInt = isInt_str

But eh.. this is copying and recopying and recopying the entire string! (And yet it's the fastest method!?) A C method could scan it Once Through, and be done. A C method that scans the string once through would be the Right Thing to do, I think? I guess there might be some string encoding issues to deal with.. Anyway, I'd try and work one out now, but I'm out of time for this. =( Maybe I'll come back to it later.
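
As a rough illustration of the single-pass idea, here is a pure-Python sketch. It is not the C routine described above, it accepts only an optional sign followed by ASCII digits (no '1.0' forms), and the name `is_int_scan` is invented for this example. A Python-level loop like this is not guaranteed to beat the string methods, which do their work in C.

```python
def is_int_scan(s):
    """Scan s once: optional '+'/'-' sign, then at least one ASCII digit."""
    start = 1 if s[:1] in ("+", "-") else 0
    if start == len(s):
        return False  # empty string or a bare sign
    for i in range(start, len(s)):
        if not ("0" <= s[i] <= "9"):
            return False
    return True
```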



Use a regular expression:


import re
def RepresentsInt(s):
    return re.match(r"[-+]?\d+$", s) is not None

If you must accept decimal fractions also:


def RepresentsInt(s):
    return re.match(r"[-+]?\d+(\.0*)?$", s) is not None

For improved performance if you're doing this often, compile the regular expression only once using re.compile().
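
A minimal sketch of that advice, assuming Python 3.4 or later for `fullmatch` (the names `INT_RE` and `represents_int` are illustrative). Using `fullmatch` also sidesteps a subtlety of `$`, which will match just before a trailing newline:

```python
import re

# Compiled once at import time and reused on every call.
INT_RE = re.compile(r"[-+]?\d+")

def represents_int(s):
    # fullmatch requires the whole string to match, so no ^ or $ is needed.
    return INT_RE.fullmatch(s) is not None

print(represents_int("-17"))
print(represents_int("3.14"))
print(represents_int("12\n"))
```

With the `$`-based pattern, `re.match(r"[-+]?\d+$", "12\n")` still succeeds, which may or may not be what you want.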




The proper RegEx solution would combine the ideas of Greg Hewgill and Nowell, but not use a global variable. You can accomplish this by attaching an attribute to the method. Also, I know that it is frowned upon to put imports in a method, but what I'm going for is a "lazy module" effect like http://peak.telecommunity.com/DevCenter/Importing#lazy-imports

edit: My favorite technique so far is to use exclusively methods of the String object.


#!/usr/bin/env python

# Uses exclusively methods of the String object
def isInteger(i):
    i = str(i)
    return i=='0' or (i if i.find('..') > -1 else i.lstrip('-+').rstrip('0').rstrip('.')).isdigit()

# Uses re module for regex
def isIntegre(i):
    import re
    if not hasattr(isIntegre, '_re'):
        print("I compile only once. Remove this line when you are confident in that.")
        isIntegre._re = re.compile(r"[-+]?\d+(\.0*)?$")
    return isIntegre._re.match(str(i)) is not None

# When executed directly run Unit Tests
if __name__ == '__main__':
    for obj in [
                # integers
                0, 1, -1, 1.0, -1.0,
                '0', '0.','0.0', '1', '-1', '+1', '1.0', '-1.0', '+1.0',
                # non-integers
                1.1, -1.1, '1.1', '-1.1', '+1.1',
                '1.1.1', '1.1.0', '1.0.1', '1.0.0',
                '1.0.', '1..0', '1..',
                '0.0.', '0..0', '0..',
                'one', object(), (1,2,3), [1,2,3], {'one':'two'}
                ]:
        # Notice the integre uses 're' (intended to be humorous)
        integer = ('an integer' if isInteger(obj) else 'NOT an integer')
        integre = ('an integre' if isIntegre(obj) else 'NOT an integre')
        # Make strings look like strings in the output
        if isinstance(obj, str):
            obj = ("'%s'" % (obj,))
        print("%30s is %14s is %14s" % (obj, integer, integre))

And for the less adventurous members of the class, here is the output:


I compile only once. Remove this line when you are confident in that.
                             0 is     an integer is     an integre
                             1 is     an integer is     an integre
                            -1 is     an integer is     an integre
                           1.0 is     an integer is     an integre
                          -1.0 is     an integer is     an integre
                           '0' is     an integer is     an integre
                          '0.' is     an integer is     an integre
                         '0.0' is     an integer is     an integre
                           '1' is     an integer is     an integre
                          '-1' is     an integer is     an integre
                          '+1' is     an integer is     an integre
                         '1.0' is     an integer is     an integre
                        '-1.0' is     an integer is     an integre
                        '+1.0' is     an integer is     an integre
                           1.1 is NOT an integer is NOT an integre
                          -1.1 is NOT an integer is NOT an integre
                         '1.1' is NOT an integer is NOT an integre
                        '-1.1' is NOT an integer is NOT an integre
                        '+1.1' is NOT an integer is NOT an integre
                       '1.1.1' is NOT an integer is NOT an integre
                       '1.1.0' is NOT an integer is NOT an integre
                       '1.0.1' is NOT an integer is NOT an integre
                       '1.0.0' is NOT an integer is NOT an integre
                        '1.0.' is NOT an integer is NOT an integre
                        '1..0' is NOT an integer is NOT an integre
                         '1..' is NOT an integer is NOT an integre
                        '0.0.' is NOT an integer is NOT an integre
                        '0..0' is NOT an integer is NOT an integre
                         '0..' is NOT an integer is NOT an integre
                         'one' is NOT an integer is NOT an integre
<object object at 0x103b7d0a0> is NOT an integer is NOT an integre
                     (1, 2, 3) is NOT an integer is NOT an integre
                     [1, 2, 3] is NOT an integer is NOT an integre
                {'one': 'two'} is NOT an integer is NOT an integre



Greg Hewgill's approach was missing a few components: the leading "^" to only match the start of the string (optional with re.match, which already anchors there), and compiling the re beforehand. But this approach will allow you to avoid a try: except:


import re
INT_RE = re.compile(r"^[-]?\d+$")
def RepresentsInt(s):
    return INT_RE.match(str(s)) is not None

I would be interested to know why you are trying to avoid try: except?




>>> "+7".lstrip("-+").isdigit()
True
>>> "-7".lstrip("-+").isdigit()
True
>>> "7".lstrip("-+").isdigit()
True
>>> "13.4".lstrip("-+").isdigit()
False

So your function would be:


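def is_int(val):
    stripped = val.lstrip("-+")
    # allow at most one leading sign; val[1] would fail on one-character strings
    return len(val) - len(stripped) <= 1 and stripped.isdigit()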



I think


s.startswith('-') and s[1:].isdigit()

would be better rewritten as:


s.replace('-', '').isdigit()

because s[1:] also creates a new string


But a much better solution is





This is probably the most straightforward and pythonic way to approach it in my opinion. I didn't see this solution and it's basically the same as the regex one, but without the regex.


def is_int(test):
    import string
    return not (set(test) - set(string.digits))



Here is a function that parses without raising errors. It handles the obvious cases and returns None on failure (it handles up to 2000 '-/+' signs by default on CPython!):


#!/usr/bin/env python

def get_int(number):
    splits = number.split('.')
    if len(splits) > 2:
        # too many splits
        return None
    if len(splits) == 2 and splits[1]:
        # handle decimal part recursively :-)
        if get_int(splits[1]) != 0:
            return None

    int_part = splits[0].lstrip("+")
    if int_part.startswith('-'):
        # handle minus sign recursively :-)
        inner = get_int(int_part[1:])
        return None if inner is None else -inner
    # return None (not False) for non-digits so callers can test `is not None`
    return int(int_part) if int_part.isdigit() else None

Some tests:


tests = ["0", "0.0", "0.1", "1", "1.1", "1.0", "-1", "-1.1", "-1.0", "-0", "--0", "---3", '.3', '--3.', "+13", "+-1.00", "--+123", "-0.000"]

for t in tests:
    print("get_int(%s) = %s" % (t, get_int(str(t))))



get_int(0) = 0
get_int(0.0) = 0
get_int(0.1) = None
get_int(1) = 1
get_int(1.1) = None
get_int(1.0) = 1
get_int(-1) = -1
get_int(-1.1) = None
get_int(-1.0) = -1
get_int(-0) = 0
get_int(--0) = 0
get_int(---3) = -3
get_int(.3) = None
get_int(--3.) = 3
get_int(+13) = 13
get_int(+-1.00) = -1
get_int(--+123) = 123
get_int(-0.000) = 0

For your needs you can use:


def int_predicate(number):
     return get_int(number) is not None



I have one possibility that doesn't use int at all, and should not raise an exception unless the string does not represent a number



It should work for any kind of string that float accepts, positive, negative, engineering notation...
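
A sketch of how such a check might look, assuming it is built on float() and float.is_integer() (the answer does not say which calls it uses, and the name `float_is_whole` is hypothetical). As described, float() still raises ValueError for strings that are not numbers at all.

```python
def float_is_whole(s):
    """True if float(s) is a finite whole number.

    Raises ValueError when s is not something float() can parse.
    """
    return float(s).is_integer()

print(float_is_whole("3"))
print(float_is_whole("-17.0"))
print(float_is_whole("1e3"))
print(float_is_whole("3.14"))
```

Because floats have limited precision, a string such as '1.0000000000000000001' is rounded to 1.0 and would count as whole under this sketch.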




I really liked Shavais' post, but I added one more test case ( & the built in isdigit() function):


def isInt_loop(v):
    v = str(v).strip()
    # swapping '0123456789' for '9876543210' makes nominal difference (might have because '1' is toward the beginning of the string)
    numbers = '0123456789'
    for i in v:
        if i not in numbers:
            return False
    return True

def isInt_Digit(v):
    v = str(v).strip()
    return v.isdigit()

and they consistently and significantly beat the times of the rest:


isInt_try:   0.4628
isInt_str:   0.3556
isInt_re:    0.4889
isInt_re2:   0.2726
isInt_loop:   0.1842
isInt_Digit:   0.1577

using stock Python 2.7:

$ python --version
Python 2.7.10

Both of the test cases I added (isInt_loop and isInt_Digit) pass exactly the same tests (they both accept only unsigned integers), but I thought people could be more clever in modifying the string implementation (isInt_loop) than the built-in isdigit() function, so I included it even though there's a slight difference in execution time. (Both methods beat everything else by a lot, but don't handle the extra stuff: '.', '+', '-'.)

Also, I did find it interesting to note that the regex (isInt_re2 method) beat the string comparison in the same test that was performed by Shavais in 2012 (currently 2018). Maybe the regex libraries have been improved?



Uh.. Try this:

def int_check(a):
    if int(a) == a:
        return True
    else:
        return False

This works if you don't put a string that's not a number.


Also (I forgot the number-check part), there is a method that checks whether a string consists only of digits: str.isdigit(). Here's an example:


a = '2'

If you call a.isdigit(), it will return True.




