How can I check if a string represents an int, without using try/except?


Is there any way to tell whether a string represents an integer (e.g., '3', '-17' but not '3.14' or 'asfasfas') without using a try/except mechanism?

is_int('3.14') = False
is_int('-7')   = True

13 answers

#1


264  

If you're really just annoyed at using try/excepts all over the place, please just write a helper function:

def RepresentsInt(s):
    try: 
        int(s)
        return True
    except ValueError:
        return False

>>> print RepresentsInt("+123")
True
>>> print RepresentsInt("10.0")
False

It's going to be WAY more code to exactly cover all the strings that Python considers integers. I say just be pythonic on this one.
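
To illustrate why covering every accepted string by hand takes more code, here is a small sketch that probes a few inputs int() accepts. It assumes Python 3.6 or later (underscore separators were added in 3.6); the name represents_int and the sample list are illustrative, not part of the answer above.

```python
def represents_int(s):
    """Return True if int(s) would succeed for the string s."""
    try:
        int(s)
        return True
    except ValueError:
        return False

# Inputs that int() accepts but a naive "only ASCII digits" check would reject:
# surrounding whitespace, a leading plus sign, underscore separators,
# and non-ASCII decimal digits (here ARABIC-INDIC DIGIT THREE).
for text in [" 42 ", "+7", "1_000", "\u0663"]:
    print(repr(text), represents_int(text))
```

A hand-written check that wants to agree with int() would have to handle each of these cases explicitly.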

#2


460  

With positive integers you could use .isdigit:

>>> '16'.isdigit()
True

It doesn't work with negative integers though. I suppose you could try the following:

>>> s = '-17'
>>> s.startswith('-') and s[1:].isdigit()
True

It won't work with the '16.0' format, which is similar to int casting in this sense.

edit:

def check_int(s):
    # s[:1] is '' for an empty string, so no IndexError is raised
    if s[:1] in ('-', '+'):
        return s[1:].isdigit()
    return s.isdigit()
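
One caveat worth sketching: in Python 3, str.isdigit() also returns True for characters such as superscript two, which int() rejects. A possible variant, assuming Python 3 and using str.isdecimal() instead (check_int_decimal is a hypothetical name, not from the answer):

```python
def check_int_decimal(s):
    """Hypothetical variant of check_int built on str.isdecimal()."""
    # Drop at most one leading sign; s[:1] is '' for an empty string.
    if s[:1] in ("-", "+"):
        s = s[1:]
    return s.isdecimal()

print("\u00b2".isdigit())            # superscript two counts as a "digit"
print(check_int_decimal("\u00b2"))   # but it is not a decimal character
print(check_int_decimal("-17"))
```

This stays closer to what int() accepts, though it still ignores surrounding whitespace and underscore separators.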

#3


59  

You know, I've found (and I've tested this over and over) that try/except does not perform all that well, for whatever reason. I frequently try several ways of doing things, and I don't think I've ever found a method that uses try/except to perform the best of those tested, in fact it seems to me those methods have usually come out close to the worst, if not the worst. Not in every case, but in many cases. I know a lot of people say it's the "Pythonic" way, but that's one area where I part ways with them. To me, it's neither very performant nor very elegant, so, I tend to only use it for error trapping and reporting.

I was going to gripe that PHP, perl, ruby, C, and even the freaking shell have simple functions for testing a string for integer-hood, but due diligence in verifying those assumptions tripped me up! Apparently this lack is a common sickness.

Here's a quick and dirty edit of Bruno's post:

import sys, time, re

g_intRegex = re.compile(r"[-+]?\d+(\.0*)?$")

testvals = [
    # integers
    0, 1, -1, 1.0, -1.0,
    '0', '0.','0.0', '1', '-1', '+1', '1.0', '-1.0', '+1.0', '06',
    # non-integers
    1.1, -1.1, '1.1', '-1.1', '+1.1',
    '1.1.1', '1.1.0', '1.0.1', '1.0.0',
    '1.0.', '1..0', '1..',
    '0.0.', '0..0', '0..',
    'one', object(), (1,2,3), [1,2,3], {'one':'two'},
    # with spaces
    ' 0 ', ' 0.', ' .0','.01 '
]

def isInt_try(v):
    try:     i = int(v)
    except:  return False
    return True

def isInt_str(v):
    v = str(v).strip()
    return v=='0' or (v if v.find('..') > -1 else v.lstrip('-+').rstrip('0').rstrip('.')).isdigit()

def isInt_re(v):
    import re
    if not hasattr(isInt_re, 'intRegex'):
        isInt_re.intRegex = re.compile(r"[-+]?\d+(\.0*)?$")
    return isInt_re.intRegex.match(str(v).strip()) is not None

def isInt_re2(v):
    return g_intRegex.match(str(v).strip()) is not None

def timeFunc(func, times):
    t1 = time.time()
    for n in xrange(times):
        for v in testvals: 
            r = func(v)
    t2 = time.time()
    return t2 - t1

def testFuncs(funcs):
    for func in funcs:
        sys.stdout.write( "\t%s\t|" % func.__name__)
    print
    for v in testvals:
        sys.stdout.write("%s" % str(v))
        for func in funcs:
            sys.stdout.write( "\t\t%s\t|" % func(v))
        print 

if __name__ == '__main__':
    print
    print "tests.."
    testFuncs((isInt_try, isInt_str, isInt_re, isInt_re2))
    print

    print "timings.."
    print "isInt_try:   %6.4f" % timeFunc(isInt_try, 10000)
    print "isInt_str:   %6.4f" % timeFunc(isInt_str, 10000) 
    print "isInt_re:    %6.4f" % timeFunc(isInt_re, 10000)
    print "isInt_re2:   %6.4f" % timeFunc(isInt_re2, 10000)

Here's the interesting part of the output:

timings..
isInt_try:   1.2454
isInt_str:   0.7878
isInt_re:    1.5731
isInt_re2:   0.8087

As you can see, the string method is the fastest. It is almost twice as fast as the regex method that avoids relying on any globals, and more than half again faster than the try:except method. The regex method that relies on some globals (or, well, module attributes) is a close second.
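
For anyone repeating the measurement, the standard-library timeit module is the usual tool. A rough sketch, assuming Python 3 and using simplified stand-ins (is_int_try, is_int_str and samples are hypothetical names, not the functions from the script above); absolute numbers will vary by machine and interpreter:

```python
import timeit

def is_int_try(v):
    # Stand-in for the try/except approach.
    try:
        int(v)
    except (ValueError, TypeError):
        return False
    return True

def is_int_str(v):
    # Simplified string-method stand-in: no handling of '1.0'-style input.
    return str(v).strip().lstrip("-+").isdigit()

samples = ["0", "-1", "+1", "1.1", "one", " 7 "]

for func in (is_int_try, is_int_str):
    # Time 10000 passes over the sample list for each candidate.
    seconds = timeit.timeit(lambda: [func(v) for v in samples], number=10000)
    print("%-12s %.4f" % (func.__name__, seconds))
```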

Of these, I think my choice would be

isInt = isInt_str

But eh.. this is copying and recopying and recopying the entire string! (And yet it's the fastest method!?) A C method could scan it Once Through, and be done. A C method that scans the string once through would be the Right Thing to do, I think? I guess there might be some string encoding issues to deal with.. Anyway, I'd try and work one out now, but I'm out of time for this. =( Maybe I'll come back to it later.
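
The "scan it once through" idea can be sketched in Python as well, with the caveat that a pure-Python loop is unlikely to be fast; it only shows the logic a C version might follow. The function name is_int_single_pass is hypothetical, and the grammar is an assumption modelled on the regex used above (optional sign, ASCII digits, optional '.' followed only by zeros), without the whitespace stripping:

```python
def is_int_single_pass(text):
    """Scan text once, left to right, without building new strings."""
    i, n = 0, len(text)
    # Optional leading sign.
    if i < n and text[i] in "+-":
        i += 1
    # At least one ASCII digit is required.
    start = i
    while i < n and "0" <= text[i] <= "9":
        i += 1
    if i == start:
        return False
    # Optional '.' followed only by zeros.
    if i < n and text[i] == ".":
        i += 1
        while i < n and text[i] == "0":
            i += 1
    # Anything left over means the input was not integer-like.
    return i == n

for sample in ["0.", "-1.0", "06", "1..0", ".0", "one"]:
    print(repr(sample), is_int_single_pass(sample))
```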

#4


19  

Use a regular expression:

import re
def RepresentsInt(s):
    return re.match(r"[-+]?\d+$", s) is not None

If you must accept decimal fractions also:

def RepresentsInt(s):
    return re.match(r"[-+]?\d+(\.0*)?$", s) is not None

For improved performance if you're doing this often, compile the regular expression only once using re.compile().
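
A minimal sketch of the compile-once version, assuming Python 3.4 or later for fullmatch() (INT_PATTERN is a hypothetical module-level name). Using fullmatch() instead of a trailing "$" also avoids one subtlety: "$" matches just before a final newline, so the patterns above accept "12\n".

```python
import re

# Compiled once at import time; hypothetical name.
INT_PATTERN = re.compile(r"[-+]?\d+")

def represents_int(s):
    # fullmatch() requires the whole string to match the pattern.
    return INT_PATTERN.fullmatch(s) is not None

print(represents_int("-17"))
print(represents_int("3.14"))
print(represents_int("12\n"))
# For comparison, the "$"-terminated pattern tolerates the trailing newline:
print(re.match(r"[-+]?\d+$", "12\n") is not None)
```

Whether a trailing newline should count is a design choice; int() itself strips surrounding whitespace.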

#5


15  

The proper RegEx solution would combine the ideas of Greg Hewgill and Nowell, but not use a global variable. You can accomplish this by attaching an attribute to the method. Also, I know that it is frowned upon to put imports in a method, but what I'm going for is a "lazy module" effect like http://peak.telecommunity.com/DevCenter/Importing#lazy-imports

edit: My favorite technique so far is to use exclusively methods of the String object.

#!/usr/bin/env python

# Uses exclusively methods of the String object
def isInteger(i):
    i = str(i)
    return i=='0' or (i if i.find('..') > -1 else i.lstrip('-+').rstrip('0').rstrip('.')).isdigit()

# Uses re module for regex
def isIntegre(i):
    import re
    if not hasattr(isIntegre, '_re'):
        print("I compile only once. Remove this line when you are confident in that.")
        isIntegre._re = re.compile(r"[-+]?\d+(\.0*)?$")
    return isIntegre._re.match(str(i)) is not None

# When executed directly run Unit Tests
if __name__ == '__main__':
    for obj in [
                # integers
                0, 1, -1, 1.0, -1.0,
                '0', '0.','0.0', '1', '-1', '+1', '1.0', '-1.0', '+1.0',
                # non-integers
                1.1, -1.1, '1.1', '-1.1', '+1.1',
                '1.1.1', '1.1.0', '1.0.1', '1.0.0',
                '1.0.', '1..0', '1..',
                '0.0.', '0..0', '0..',
                'one', object(), (1,2,3), [1,2,3], {'one':'two'}
            ]:
        # Notice the integre uses 're' (intended to be humorous)
        integer = ('an integer' if isInteger(obj) else 'NOT an integer')
        integre = ('an integre' if isIntegre(obj) else 'NOT an integre')
        # Make strings look like strings in the output
        if isinstance(obj, str):
            obj = ("'%s'" % (obj,))
        print("%30s is %14s is %14s" % (obj, integer, integre))

And for the less adventurous members of the class, here is the output:

I compile only once. Remove this line when you are confident in that.
                             0 is     an integer is     an integre
                             1 is     an integer is     an integre
                            -1 is     an integer is     an integre
                           1.0 is     an integer is     an integre
                          -1.0 is     an integer is     an integre
                           '0' is     an integer is     an integre
                          '0.' is     an integer is     an integre
                         '0.0' is     an integer is     an integre
                           '1' is     an integer is     an integre
                          '-1' is     an integer is     an integre
                          '+1' is     an integer is     an integre
                         '1.0' is     an integer is     an integre
                        '-1.0' is     an integer is     an integre
                        '+1.0' is     an integer is     an integre
                           1.1 is NOT an integer is NOT an integre
                          -1.1 is NOT an integer is NOT an integre
                         '1.1' is NOT an integer is NOT an integre
                        '-1.1' is NOT an integer is NOT an integre
                        '+1.1' is NOT an integer is NOT an integre
                       '1.1.1' is NOT an integer is NOT an integre
                       '1.1.0' is NOT an integer is NOT an integre
                       '1.0.1' is NOT an integer is NOT an integre
                       '1.0.0' is NOT an integer is NOT an integre
                        '1.0.' is NOT an integer is NOT an integre
                        '1..0' is NOT an integer is NOT an integre
                         '1..' is NOT an integer is NOT an integre
                        '0.0.' is NOT an integer is NOT an integre
                        '0..0' is NOT an integer is NOT an integre
                         '0..' is NOT an integer is NOT an integre
                         'one' is NOT an integer is NOT an integre
<object object at 0x103b7d0a0> is NOT an integer is NOT an integre
                     (1, 2, 3) is NOT an integer is NOT an integre
                     [1, 2, 3] is NOT an integer is NOT an integre
                {'one': 'two'} is NOT an integer is NOT an integre

#6


3  

Greg Hewgill's approach was missing a few components: the leading "^" to only match the start of the string, and compiling the re beforehand. But this approach will allow you to avoid a try: except:

import re
INT_RE = re.compile(r"^[-]?\d+$")
def RepresentsInt(s):
    return INT_RE.match(str(s)) is not None
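
A small check of how the anchoring behaves, as a sketch assuming Python 3 (the names unanchored and INT_RE_COPY are illustrative). As far as the re module goes, match() only ever tries position 0, so the leading "^" is harmless rather than required; it would matter with search(). Note also that "[-]?" leaves out the plus sign.

```python
import re

unanchored = re.compile(r"\d+$")
print(unanchored.match("a1"))    # match() only tries the start of the string
print(unanchored.search("a1"))   # search() scans forward and finds "1"

# Same pattern as INT_RE above, copied here so the sketch is self-contained.
INT_RE_COPY = re.compile(r"^[-]?\d+$")
print(INT_RE_COPY.match("-7") is not None)
print(INT_RE_COPY.match("+7") is not None)   # "+" is not allowed by "[-]?"
```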

I would be interested to know why you are trying to avoid try: except.

#7


3  

>>> "+7".lstrip("-+").isdigit()
True
>>> "-7".lstrip("-+").isdigit()
True
>>> "7".lstrip("-+").isdigit()
True
>>> "13.4".lstrip("-+").isdigit()
False

So your function would be:

def is_int(val):
    # Allow at most one leading sign, then require digits only.
    stripped = val.lstrip("-+")
    return len(val) - len(stripped) <= 1 and stripped.isdigit()

#8


2  

I think

s.startswith('-') and s[1:].isdigit()

would be better rewritten as:

s.replace('-', '').isdigit()

because s[1:] also creates a new string.

But a much better solution is

s.lstrip('+-').isdigit()

#9


1  

This is probably the most straightforward and pythonic way to approach it in my opinion. I didn't see this solution and it's basically the same as the regex one, but without the regex.

def is_int(test):
    import string
    # bool(test) rejects the empty string, whose character set is trivially empty
    return bool(test) and not (set(test) - set(string.digits))

#10


1  

Here is a function that parses without raising errors. It handles the obvious cases and returns None on failure (handles up to 2000 '-/+' signs by default on CPython!):

#!/usr/bin/env python

def get_int(number):
    splits = number.split('.')
    if len(splits) > 2:
        # too many splits
        return None
    if len(splits) == 2 and splits[1]:
        # handle decimal part recursively :-)
        if get_int(splits[1]) != 0:
            return None

    int_part = splits[0].lstrip("+")
    if int_part.startswith('-'):
        # handle minus sign recursively :-)
        inner = get_int(int_part[1:])
        return None if inner is None else -inner
    # return None (not False) when the remaining text is not all digits
    return int(int_part) if int_part.isdigit() else None

Some tests:

tests = ["0", "0.0", "0.1", "1", "1.1", "1.0", "-1", "-1.1", "-1.0", "-0", "--0", "---3", '.3', '--3.', "+13", "+-1.00", "--+123", "-0.000"]

for t in tests:
    print "get_int(%s) = %s" % (t, get_int(str(t)))

Results:

get_int(0) = 0
get_int(0.0) = 0
get_int(0.1) = None
get_int(1) = 1
get_int(1.1) = None
get_int(1.0) = 1
get_int(-1) = -1
get_int(-1.1) = None
get_int(-1.0) = -1
get_int(-0) = 0
get_int(--0) = 0
get_int(---3) = -3
get_int(.3) = None
get_int(--3.) = 3
get_int(+13) = 13
get_int(+-1.00) = -1
get_int(--+123) = 123
get_int(-0.000) = 0

For your needs you can use:

def int_predicate(number):
     return get_int(number) is not None

#11


0  

I have one possibility that doesn't use int at all, and should not raise an exception unless the string does not represent a number:

float(number)==float(number)//1

It should work for any kind of string that float accepts, positive, negative, engineering notation...
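
A quick sketch of how that expression behaves, assuming Python 3 (float_says_int is a hypothetical wrapper name). Two things to keep in mind: float() still raises ValueError for text such as 'asfasfas', so a caller would need a try/except or a pre-check anyway, and float rounding can turn a non-integer into an integer.

```python
def float_says_int(number):
    """Hypothetical wrapper around the float-based expression."""
    value = float(number)  # raises ValueError for non-numeric text
    return value == value // 1

for text in ["-7", "1e3", "3.14", "0.99999999999999999999"]:
    print(text, float_says_int(text))
```

The last sample is reported as an integer because the nearest double to that value is exactly 1.0.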

#12


0  

I really liked Shavais' post, but I added one more test case (and the built-in isdigit() function):

def isInt_loop(v):
    v = str(v).strip()
    # swapping '0123456789' for '9876543210' makes nominal difference (might have because '1' is toward the beginning of the string)
    numbers = '0123456789'
    for i in v:
        if i not in numbers:
            return False
    return True

def isInt_Digit(v):
    v = str(v).strip()
    return v.isdigit()

and both consistently and significantly beat the times of the rest:

timings..
isInt_try:   0.4628
isInt_str:   0.3556
isInt_re:    0.4889
isInt_re2:   0.2726
isInt_loop:   0.1842
isInt_Digit:   0.1577

using stock Python 2.7:

$ python --version
Python 2.7.10

The two functions I added (isInt_loop and isInt_Digit) pass exactly the same test cases (they both only accept unsigned integers), but I thought that people could be more clever with modifying the string implementation (isInt_loop) as opposed to the built-in isdigit() function, so I included it, even though there's a slight difference in execution time. (Both methods beat everything else by a lot, but don't handle the extra characters: '.', '+', '-'.)

Also, I did find it interesting to note that the regex (isInt_re2 method) beat the string comparison in the same test that was performed by Shavais in 2012 (currently 2018). Maybe the regex libraries have been improved?

#13


-4  

Uh.. Try this:

def int_check(a):
    if int(a) == a:
        return True
    else:
        return False

This only works if you pass a number rather than a string: int('3') == '3' is False, and a string that is not a number raises ValueError.

Also (I forgot to include the number-check part), there is a function that checks whether a string consists of digits: str.isdigit(). Here's an example:

a = '2'
a.isdigit()

If you call a.isdigit(), it will return True.

