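Sum the digits of a number

If I want to find the sum of the digits of a number, for example:

  • Input: 932
  • Output: 14, which is (9 + 3 + 2)

What is the fastest way of doing this?

I instinctively did:

sum(int(digit) for digit in str(number))

and I found this online:

sum(map(int, str(number)))

Which is best to use for speed, and are there any other methods which are even faster?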


Both lines you posted are fine, but you can do it purely in integers, and it will be the most efficient:

def sum_digits(n):
    s = 0
    while n:
        s += n % 10
        n //= 10
    return s

or with divmod:

def sum_digits2(n):
    s = 0
    while n:
        n, remainder = divmod(n, 10)
        s += remainder
    return s

Slightly faster is using a single assignment statement:

def sum_digits3(n):
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r

> %timeit sum_digits(n)
1000000 loops, best of 3: 574 ns per loop


> %timeit sum_digits2(n)
1000000 loops, best of 3: 716 ns per loop


> %timeit sum_digits3(n)
1000000 loops, best of 3: 479 ns per loop


> %timeit sum(map(int, str(n)))
1000000 loops, best of 3: 1.42 us per loop


> %timeit sum([int(digit) for digit in str(n)])
100000 loops, best of 3: 1.52 us per loop


> %timeit sum(int(digit) for digit in str(n))
100000 loops, best of 3: 2.04 us per loop
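The timings above do not say which value of n or which Python version was used, so the absolute numbers will differ on other machines. A minimal sketch for re-running the comparison with the standard timeit module, assuming n = 932 (the value from the question) and an arbitrary repetition count:

```python
import timeit


def sum_digits(n):
    s = 0
    while n:
        s += n % 10
        n //= 10
    return s


def sum_digits3(n):
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r


n = 932        # assumption: the answer does not state which n was timed
CALLS = 10_000  # assumption: arbitrary repetition count for this sketch

candidates = {
    "sum_digits": lambda: sum_digits(n),
    "sum_digits3": lambda: sum_digits3(n),
    "sum(map(int, str(n)))": lambda: sum(map(int, str(n))),
    "sum(int(d) for d in str(n))": lambda: sum(int(d) for d in str(n)),
}

for name, fn in candidates.items():
    assert fn() == 14  # every variant must agree before timing it
    best = min(timeit.repeat(fn, number=CALLS, repeat=3))
    print(f"{name}: {best / CALLS * 1e9:.0f} ns per call")
```

The lambda wrappers add a small constant overhead to every measurement, so only the relative ordering is meaningful.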

This might help:

def digit_sum(n):
    num_str = str(n)
    total = 0  # avoid shadowing the built-in sum()
    for i in range(len(num_str)):
        total += int(num_str[i])
    return total

If you want to keep summing the digits until you get a single-digit number (one of my favorite characteristics of numbers divisible by 9) you can do:

def digital_root(n):
    x = sum(int(digit) for digit in str(n))
    if x < 10:
        return x
    else:
        return digital_root(x)

Which actually turns out to be pretty fast itself...

%timeit digital_root(12312658419614961365)


10000 loops, best of 3: 22.6 µs per loop

While doing some Codecademy challenges, I solved it like this:

def digit_sum(n):
    digits = []
    nstr = str(n)
    for x in nstr:
        digits.append(int(x))
    return sum(digits)

You can also try this with the built-in function divmod():

number = int(input('enter any integer: = '))  # assumed non-negative
total = 0  # avoid shadowing the built-in sum()
while number != 0:
    number, dig = divmod(number, 10)  # quotient, last digit
    total += dig
print(total)

This works for a number with any number of digits.
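One caveat for the integer loops in this thread: Python's floor division rounds toward negative infinity, so divmod(-1, 10) is (-1, 9) and a negative input never reaches 0. A minimal sketch that sidesteps this by taking the absolute value first (the name sum_digits_signed is made up for this example):

```python
def sum_digits_signed(n):
    """Digit sum that ignores the sign of n (hypothetical helper)."""
    n = abs(n)  # without this, a negative n would loop forever
    total = 0
    while n:
        n, digit = divmod(n, 10)  # quotient, last digit
        total += digit
    return total


print(sum_digits_signed(932))
print(sum_digits_signed(-932))
```

Both calls print 14, since 9 + 3 + 2 = 14 regardless of the sign.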

Found this on one of the problem solving challenge websites. Not mine, but it works.

num = 0            # replace 0 with whatever number you want to sum up
print(sum([int(k) for k in str(num)]))

Try this

    print(sum(list(map(int,input("Enter your number ")))))

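Here is a solution without any loop or recursion. Note that it returns the digital root (the digits summed repeatedly until a single digit remains, e.g. 932 → 14 → 5) rather than the plain digit sum, and it works for non-negative integers only (Python 3):

def digital_root(n):
    if n > 0:
        s = (n - 1) // 9
        return n - 9 * s
    return 0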
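It is worth checking what the closed-form function above actually returns. The sketch below copies the same arithmetic under the made-up name digital_root_closed_form and compares it with a plain digit sum and with repeated digit summing:

```python
def digit_sum(n):
    return sum(map(int, str(n)))


def digital_root_closed_form(n):
    # Same arithmetic as the loop-free function above.
    if n > 0:
        return n - 9 * ((n - 1) // 9)
    return 0


def digital_root_by_repetition(n):
    # Keep summing digits until a single digit remains.
    while n >= 10:
        n = digit_sum(n)
    return n


for n in (7, 18, 932):
    print(n, digit_sum(n), digital_root_closed_form(n))
```

For 932 the digit sum is 14 while the closed form gives 5, so the two agree only when the digit sum is already a single digit.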

A base 10 number can be expressed as a series of the form

a × 10^p + b × 10^(p-1) + … + z × 10^0

so the sum of a number's digits is the sum of the coefficients of the terms.

Based on this information, the sum of the digits can be computed like this:

import math


def add_digits(n):
    # Assume n >= 0, else we should take abs(n)
    if 0 <= n < 10:
        return n
    r = 0
    ndigits = int(math.log10(n))
    for p in range(ndigits, -1, -1):
        d, n = divmod(n, 10 ** p)
        r += d
    return r


This is effectively the reverse of the continuous division by 10 in the accepted answer. Given the extra computation in this function compared to the accepted answer, it's not surprising to find that this approach performs poorly in comparison: it's about 3.5 times slower, and about twice as slow as

sum(int(x) for x in str(n))

Whether it's faster to work with math or strings here depends on the size of the input number.

For small numbers (fewer than 20 digits in length), use division and modulus:

def sum_digits_math(n):
    r = 0
    while n:
        r, n = r + n % 10, n // 10
    return r

For large numbers (greater than 30 digits in length), use the string domain:

def sum_digits_str_fast(n):
    d = str(n)
    return sum(int(s) * d.count(s) for s in "123456789")

There is also a narrow window for numbers between 20 and 30 digits in length where sum(map(int, str(n))) is fastest. That is the purple line in the graph.

[Figure: performance profile]

The math-based approach scales poorly as the input number grows, because every n // 10 and n % 10 on an arbitrary-precision integer costs time proportional to its length, but each approach working in the string domain appears to scale linearly in the length of the input. The code that was used to generate these graphs is here. I'm using CPython 3.10.6 on macOS.
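The three regimes described in this answer could be combined into one function. This is only a sketch: the name sum_digits_auto is made up, the 20- and 30-digit cutoffs are taken from the measurements above and will likely differ on other machines, and n is assumed to be non-negative.

```python
def sum_digits_auto(n):
    """Pick a digit-sum strategy by input size (hypothetical helper, n >= 0)."""
    # Integers below 2**66 have at most 20 decimal digits, so bit_length()
    # gives a cheap size check without building a string first.
    if n.bit_length() <= 66:
        r = 0
        while n:
            r, n = r + n % 10, n // 10
        return r
    d = str(n)
    if len(d) <= 30:
        # Middle window: convert each character individually.
        return sum(map(int, d))
    # Large inputs: count each non-zero digit once.
    return sum(int(s) * d.count(s) for s in "123456789")


print(sum_digits_auto(932))
print(sum_digits_auto(int("9" * 50)))
```

The first call prints 14. The second prints 450, because fifty nines sum to 9 × 50.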

Why is the highest-rated answer 3.70x slower than this?

% echo; ( time (nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
| mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE0 \
| mawk2 '
function __(_,___,____,_____) {
    ____=gsub("[^1-9]+","",_)~""
    ___=10
    while((+____<--___) && _) {
        _____+=___*gsub(___,"",_)
    }
    return _____+length(_) }

BEGIN { FS=OFS=ORS
        RS="^$"
} END {
    print __($!_) }' )| pvE9 ) | gcat -n | lgp3 ;


in0:  173MiB 0:00:00 [1.69GiB/s] [1.69GiB/s] [<=>                                            ]
out9: 11.0 B 0:00:09 [1.15 B/s] [1.15 B/s] [<=>                                               ]
in0:  484MiB 0:00:00 [2.29GiB/s] [2.29GiB/s] [  <=>                                          ]
( nice echo  | mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE 0.1 in0 | )


8.52s user 1.10s system 100% cpu 9.576 total
1  2822068024






% echo; ( time ( nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
\
| mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE0 \
|  gtr -d '\n' \
\
|  python3 -c 'import math, os, sys;
[ print(sum(int(digit) for digit in str(ln)), end="\n")
  for ln in sys.stdin ]' )| pvE9 ) | gcat -n | lgp3 ;




in0:  484MiB 0:00:00 [ 958MiB/s] [ 958MiB/s] [     <=>                                       ]
out9: 11.0 B 0:00:35 [ 317miB/s] [ 317miB/s] [<=>                                             ]
( nice echo  | mawk2 'gsub(//,($_)($_)($_))+gsub(//,($_))+1' | pvE 0.1 in0 | )


35.22s user 0.62s system 101% cpu 35.447 total
1  2822068024

And that's being a bit generous already. On this large synthetically created test case of 2.82 GB, it's 19.2x slower.

 % echo; ( time ( pvE0 < testcases_more108.txt  |  mawk2 'function __(_,___,____,_____) { ____=gsub("[^1-9]+","",_)~"";___=10; while((+____<--___) && _) { _____+=___*gsub(___,"",_) }; return _____+length(_) } BEGIN { FS=RS="^$"; CONVFMT=OFMT="%.20g" } END { print __($_) }'  ) | pvE9 ) |gcat -n | ggXy3  | lgp3;


in0:  284MiB 0:00:00 [2.77GiB/s] [2.77GiB/s] [=>                             ]  9% ETA 0:00:00
out9: 11.0 B 0:00:11 [1016miB/s] [1016miB/s] [<=>                                             ]
in0: 2.82GiB 0:00:00 [2.93GiB/s] [2.93GiB/s] [=============================>] 100%
( pvE 0.1 in0 < testcases_more108.txt | mawk2 ; )


8.75s user 2.36s system 100% cpu 11.100 total
1  3031397722


% echo; ( time ( pvE0 < testcases_more108.txt  | gtr -d '\n' |  python3 -c 'import sys; [ print(sum(int(_) for _ in str(__))) for __ in sys.stdin ]' ) | pvE9 ) |gcat -n | ggXy3  | lgp3;




in0: 2.82GiB 0:00:02 [1.03GiB/s] [1.03GiB/s] [=============================>] 100%
out9: 11.0 B 0:03:32 [53.0miB/s] [53.0miB/s] [<=>                                             ]
( pvE 0.1 in0 < testcases_more108.txt | gtr -d '\n' | python3 -c ; )


211.47s user 3.02s system 100% cpu 3:32.69 total
1  3031397722

—————————————————————

UPDATE: native Python 3 code of that concept. Even with my horrific Python skills, I'm seeing a 4x speedup:

% echo; ( time ( pvE0 < testcases_more108.txt  \
\
|python3 -c 'import re, sys;
print(sum([ sum(int(_)*re.subn(_,"",__)[1]
                for _ in [r"1",r"2", r"3",r"4",
                          r"5",r"6",r"7",r"8",r"9"])
            for __ in sys.stdin ]))' |pvE9))|gcat -n| ggXy3|lgp3


in0: 1.88MiB 0:00:00 [18.4MiB/s] [18.4MiB/s] [>                              ]  0% ETA 0:00:00
out9: 0.00 B 0:00:51 [0.00 B/s] [0.00 B/s] [<=>                                               ]
in0: 2.82GiB 0:00:51 [56.6MiB/s] [56.6MiB/s] [=============================>] 100%
out9: 11.0 B 0:00:51 [ 219miB/s] [ 219miB/s] [<=>                                             ]


( pvE 0.1 in0 < testcases_more108.txt | python3 -c  | pvE 0.1 out9; )






48.07s user 3.57s system 100% cpu 51.278 total
1  3031397722

Even the smaller test case managed a 1.42x speedup:

 echo; ( time (nice echo 33785139853861968123689586196851968365819658395186596815968159826259681256852169852986 \
| mawk2 'gsub(//,($_)($_)$_)+gsub(//,$_)+1' ORS='' | pvE0 | python3 -c 'import re, sys; print(sum([ sum(int(_)*re.subn(_,"",__)[1] for _ in [r"1",r"2", r"3",r"4",r"5",r"6",r"7",r"8",r"9"]) for __ in sys.stdin ]))'  | pvE9  ))  |gcat -n | ggXy3 | lgp3




in0:  484MiB 0:00:00 [2.02GiB/s] [2.02GiB/s] [  <=>                                          ]
out9: 11.0 B 0:00:24 [ 451miB/s] [ 451miB/s] [<=>                                             ]
( nice echo  | mawk2 'gsub(//,($_)($_)$_)+gsub(//,$_)+1' ORS='' | pvE 0.1 in0)


20.04s user 5.10s system 100% cpu 24.988 total
1    2822068024
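The counting idea behind the awk function and the re.subn version (count how often each digit occurs, then multiply by the digit's value) can also be written with bytes.count on fixed-size chunks, which avoids loading a multi-gigabyte input at once. This is a sketch, not the code that was timed above: the name stream_digit_sum and the 1 MiB chunk size are assumptions, and any non-digit bytes such as newlines are simply ignored.

```python
import io

# Pair each non-zero digit value with its ASCII byte, e.g. (9, b"9").
DIGITS = [(d, str(d).encode()) for d in range(1, 10)]


def stream_digit_sum(stream, chunk_size=1 << 20):
    """Sum every decimal digit found in a binary stream (hypothetical helper)."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        for value, byte in DIGITS:
            total += value * chunk.count(byte)
    return total


demo = io.BytesIO(b"932\n14\n")
print(stream_digit_sum(demo))
```

The demo prints 19 (9 + 3 + 2 + 1 + 4). To read from a pipe as in the shell commands above, sys.stdin.buffer could be passed as the stream.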