Why is xrange() faster than range()?

Apparently xrange is faster but I have no idea why it's faster (and no proof besides the anecdotal so far that it is faster) or what besides that is different about

for i in range(0, 20):
for i in xrange(0, 20):


range creates a list, so if you do range(1, 10000000) it creates a list in memory with 9999999 elements.

xrange is a sequence object that evaluates lazily.
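A rough way to check the element count and the memory difference is sketched below. It assumes Python 3 on CPython, where range is the lazy object (standing in for 2.x xrange) and list(range(...)) stands in for the 2.x range list. The variable names are only illustrative.

```python
import sys

# Python 3 assumption: range is lazy, list(range(...)) is the eager 2.x-style list.
lazy = range(1, 10000000)
print(len(lazy))  # 9999999, computed from start/stop without building a list

small_lazy = range(1, 100000)
small_eager = list(small_lazy)  # materializes all 99999 integers

# The lazy object's own size does not depend on how many values it represents.
print(sys.getsizeof(lazy), sys.getsizeof(small_lazy))

# The list grows with the number of elements it holds.
print(sys.getsizeof(small_eager))
```

Note that sys.getsizeof reports only the container itself, not the integer objects a list refers to, so the real gap is larger than the printed numbers suggest.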

range creates a list, so if you do range(1, 10000000) it creates a list in memory with 9999999 elements.

xrange is not a generator, strictly speaking; it is a sequence object that evaluates lazily.

This is true, but in Python 3, range() behaves like Python 2's xrange(). If you actually need the list, you will need to do:

list(range(1,100))

Remember, use the timeit module to test which of two small snippets of code is faster!

$ python -m timeit 'for i in range(1000000):' ' pass'
10 loops, best of 3: 90.5 msec per loop
$ python -m timeit 'for i in xrange(1000000):' ' pass'
10 loops, best of 3: 51.1 msec per loop

Personally, I always use range(), unless I am dealing with really huge lists -- as you can see, time-wise, for a list of a million entries, the extra overhead is only about 0.04 seconds. And as Corey points out, in Python 3.0 xrange will go away and range will give you nice iterator behaviour anyway.

xrange only stores the range params and generates the numbers on demand. However the C implementation of Python currently restricts its args to C longs:

xrange(2**32-1, 2**32+1)  # When long is 32 bits, OverflowError: Python int too large to convert to C long
range(2**32-1, 2**32+1)   # OK --> [4294967295L, 4294967296L]

Note that in Python 3.0 there is only range and it behaves like the 2.x xrange but without the limitations on minimum and maximum end points.
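As a small illustration of that last point, the sketch below assumes Python 3 and uses endpoints beyond 64 bits, which would not fit in a C long on common platforms:

```python
# Python 3 assumption: range accepts arbitrarily large integers.
big = range(2**64 - 1, 2**64 + 1)
values = list(big)
print(values)  # [18446744073709551615, 18446744073709551616]
```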

xrange returns an iterable xrange object and only keeps one number in memory at a time. range keeps the entire list of numbers in memory.

Do spend some time with the Library Reference. The more familiar you are with it, the faster you can find answers to questions like this. Especially important are the first few chapters about builtin objects and types.

The advantage of the xrange type is that an xrange object will always take the same amount of memory, no matter the size of the range it represents. There are no consistent performance advantages.

Another way to find quick information about a Python construct is the docstring and the help-function:

print xrange.__doc__ # def doc(x): print x.__doc__ is super useful
help(xrange)

range creates a list, so if you do range(1, 10000000) it creates a list in memory with 9999999 elements. xrange behaves like a generator in that it evaluates lazily.

This brings you two advantages:

  1. You can iterate longer lists without getting a MemoryError.
  2. As it resolves each number lazily, if you stop iteration early, you won't waste time creating the whole list.
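The second advantage can be made visible with a small sketch. The helper counting_numbers and the produced list are hypothetical names introduced only for this illustration; the helper is a plain generator that records every value it actually hands out.

```python
produced = []  # records every value that was actually generated


def counting_numbers(stop):
    """Hypothetical helper: yield 0..stop-1 lazily, logging each value produced."""
    n = 0
    while n < stop:
        produced.append(n)
        yield n
        n += 1


# Ask for a billion values, but stop after the fourth one.
for value in counting_numbers(10**9):
    if value == 3:
        break

print(produced)  # [0, 1, 2, 3]
```

Only the four values that the loop consumed were ever created; an eager list would have built all of them before the loop started.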

It is for optimization reasons.

range() will create a list of values from start up to, but not including, end (0 .. 19 in your example). This will become an expensive operation on very large ranges.

xrange(), on the other hand, is much more optimised. It will only compute the next value when needed (via an xrange sequence object) and does not create a list of all values like range() does.
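To make "compute the next value when needed" concrete, here is a simplified pure-Python sketch of such an object. LazyRange is a hypothetical name, and this is not how CPython implements xrange (that is written in C); it only shows that storing start, stop and step is enough to produce every value on request.

```python
class LazyRange:
    """Hypothetical, simplified stand-in for an xrange-like object."""

    def __init__(self, start, stop, step=1):
        if step == 0:
            raise ValueError("step must not be zero")
        # Only the three parameters are stored, never the values themselves.
        self.start, self.stop, self.step = start, stop, step

    def __len__(self):
        span = self.stop - self.start if self.step > 0 else self.start - self.stop
        if span <= 0:
            return 0
        size = abs(self.step)
        return (span + size - 1) // size  # ceiling division

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("LazyRange index out of range")
        return self.start + index * self.step  # computed on demand

    def __iter__(self):
        i, n = 0, len(self)
        while i < n:
            yield self.start + i * self.step
            i += 1


print(list(LazyRange(0, 20)) == list(range(0, 20)))  # True
print(list(LazyRange(10, 0, -3)))                    # [10, 7, 4, 1]
print(LazyRange(1, 10000000)[-1])                    # 9999999
```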

I am shocked nobody read the docs:

This function is very similar to range(), but returns an xrange object instead of a list. This is an opaque sequence type which yields the same values as the corresponding list, without actually storing them all simultaneously. The advantage of xrange() over range() is minimal (since xrange() still has to create the values when asked for them) except when a very large range is used on a memory-starved machine or when all of the range’s elements are never used (such as when the loop is usually terminated with break).

range(): range(1, 10) returns a list of the numbers 1 to 9 and holds the whole list in memory.

xrange(): Like range(), but instead of returning a list, returns an object that generates the numbers in the range on demand. For looping, this is slightly faster than range() and more memory efficient. An xrange() object behaves like an iterator and generates the numbers on demand (lazy evaluation).

In [1]: range(1,10)

Out[1]: [1, 2, 3, 4, 5, 6, 7, 8, 9]

In [2]: xrange(10)

Out[2]: xrange(10)

In [3]: print xrange.__doc__

xrange([start,] stop[, step]) -> xrange object

range generates the entire list and returns it. xrange does not -- it generates the numbers in the list on demand.

xrange uses an iterator (generates values on the fly), range returns a list.

When testing range against xrange in a loop (I know I should use timeit, but this was swiftly hacked up from memory using a simple list comprehension example) I found the following:

import time

for x in range(1, 10):

    t = time.time()
    [v*10 for v in range(1, 10000)]
    print "range:  %.4f" % ((time.time()-t)*100)

    t = time.time()
    [v*10 for v in xrange(1, 10000)]
    print "xrange: %.4f" % ((time.time()-t)*100)

which gives:

$ python range_tests.py
range:  0.4273
xrange: 0.3733
range:  0.3881
xrange: 0.3507
range:  0.3712
xrange: 0.3565
range:  0.4031
xrange: 0.3558
range:  0.3714
xrange: 0.3520
range:  0.3834
xrange: 0.3546
range:  0.3717
xrange: 0.3511
range:  0.3745
xrange: 0.3523
range:  0.3858
xrange: 0.3997 <- garbage collection?

Or, using xrange in the for loop:

range:  0.4172
xrange: 0.3701
range:  0.3840
xrange: 0.3547
range:  0.3830
xrange: 0.3862 <- garbage collection?
range:  0.4019
xrange: 0.3532
range:  0.3738
xrange: 0.3726
range:  0.3762
xrange: 0.3533
range:  0.3710
xrange: 0.3509
range:  0.3738
xrange: 0.3512
range:  0.3703
xrange: 0.3509

Is my snippet testing properly? Any comments on the slower instance of xrange? Or a better example :-)
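One way to make a measurement like this less sensitive to one-off pauses is to let timeit repeat the statement and keep the best run. The sketch below assumes Python 3, so list(range(...)) stands in for the 2.x range list and plain range for xrange; best_time is a hypothetical helper, and the repeat and number values are arbitrary.

```python
import timeit


def best_time(stmt, repeat=5, number=100):
    """Hypothetical helper: best per-call time in seconds over several runs."""
    return min(timeit.repeat(stmt, repeat=repeat, number=number)) / number


# Eager: build the full list first, then iterate over it.
eager = best_time("[v * 10 for v in list(range(1, 10000))]")
# Lazy: iterate over the range object directly.
lazy = best_time("[v * 10 for v in range(1, 10000)]")

print("eager: %.6f s" % eager)
print("lazy:  %.6f s" % lazy)
```

Taking the minimum rather than the mean discards runs that were slowed down by other activity on the machine, which is the usual advice for timeit results.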

Some of the other answers mention that Python 3 eliminated 2.x's range and renamed 2.x's xrange to range. However, unless you're using 3.0 or 3.1 (which nobody should be), it's actually a somewhat different type.

As the 3.1 docs say:

Range objects have very little behavior: they only support indexing, iteration, and the len function.

However, in 3.2+, range is a full sequence—it supports extended slices, and all of the methods of collections.abc.Sequence with the same semantics as a list.*

And, at least in CPython and PyPy (the only two 3.2+ implementations that currently exist), it also has constant-time implementations of the index and count methods and the in operator (as long as you only pass it integers). This means writing 123456 in r is reasonable in 3.2+, while in 2.7 or 3.1 it would be a horrible idea.
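A short sketch of those sequence features, assuming Python 3.2 or later (the variable name r follows the text above):

```python
from collections.abc import Sequence

r = range(0, 1000000, 2)  # the even numbers below one million

print(isinstance(r, Sequence))  # True
print(r[10:20:3])               # range(20, 40, 6): slicing yields another range
print(123456 in r)              # True, answered arithmetically rather than by scanning
print(r.index(123456))          # 61728
print(r.count(123457))          # 0, odd numbers are not in r
```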



* The fact that issubclass(xrange, collections.Sequence) returns True in 2.6-2.7 and 3.0-3.1 is a bug that was fixed in 3.2 and not backported.

xrange() and range() in Python look similar to the user, but they differ in how memory is allocated.

range() allocates memory for every value it generates, so it is not recommended when a large number of values is needed.

xrange(), on the other hand, generates only one value at a time; it is typically used in a for loop to step through all the required values.

In Python 2.x:

range(x) returns a list, that is created in memory with x elements.

>>> a = range(5)
>>> a
[0, 1, 2, 3, 4]

xrange(x) returns an xrange object, which generates the numbers on demand. They are computed during the for loop (lazy evaluation).

For looping, this is slightly faster than range() and more memory efficient.

>>> b = xrange(5)
>>> b
xrange(5)

When you need to scan or print the items 0 to N, range and xrange work as follows.

range() creates a new list in memory holding all of the items 0 to N (N+1 in total) and then prints them.
xrange() creates an iterable object that steps through the items and keeps only the current item in memory, so it uses the same amount of memory throughout.

If the required element is near the beginning of the list, xrange saves a good amount of time and memory.

What?
range returns a static list at runtime.
xrange returns an object (which acts like a generator, although it's certainly not one) from which values are generated as and when required.
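The "acts like a generator but is not one" distinction can be sketched as follows, assuming Python 3, where range is the counterpart of 2.x xrange:

```python
r = range(3)
first_pass = list(r)
second_pass = list(r)  # a range can be iterated as often as you like
print(first_pass, second_pass)  # [0, 1, 2] [0, 1, 2]
print(len(r), r[1])             # 3 1: it also supports len() and indexing

g = (n for n in range(3))       # a real generator
gen_first = list(g)
gen_second = list(g)            # already exhausted after one pass
print(gen_first, gen_second)    # [0, 1, 2] []
```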

When to use which?

  • Use xrange if you want to iterate over a gigantic range, say 1 billion, especially when you have a "memory sensitive system" like a cell phone.
  • Use range if you want to iterate over the list several times.

PS: Python 3.x's range function == Python 2.x's xrange function.

The difference decreases for smaller arguments to range(..) / xrange(..):

$ python -m timeit "for i in xrange(10111):" " for k in range(100):" "  pass"
10 loops, best of 3: 59.4 msec per loop

$ python -m timeit "for i in xrange(10111):" " for k in xrange(100):" "  pass"
10 loops, best of 3: 46.9 msec per loop

In this case xrange(100) is only about 20% more efficient.

Read the following post for the comparison between range and xrange with graphical analysis.

Python range Vs xrange

See this post to find difference between range and xrange:

To quote:

range returns exactly what you think: a list of consecutive integers, of a defined length beginning with 0. xrange, however, returns an "xrange object", which acts a great deal like an iterator

range returns a list, while xrange returns an xrange object that takes the same amount of memory irrespective of the range size: only one element is generated and available per iteration. With range, all the elements are generated at once and are held in memory.

Category: python Time: 2008-09-18 Views: 0
