II. collections
The collections module supplements Python's built-in data types. Import it with import collections before using any of the objects it provides.
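1. Counter: counting elements
1.1 Description and definition
Counter supplements the dictionary. It is a subclass of dict, so it has every dictionary method plus a number of extensions.
Defining a Counter object
Counter accepts an iterable such as a list, tuple, or string and returns a dictionary in which each distinct member is a key and the number of times it occurs is the value. Its repr lists the entries from most to least common; the order among members with equal counts can differ between Python versions.
>>> c = collections.Counter("adfawreqewradfa")
>>> c
Counter({'a': 4, 'd': 2, 'e': 2, 'r': 2, 'w': 2, 'f': 2, 'q': 1})
>>> c2 = collections.Counter(['zhang', 'tom', 'peter', 'zhang'])
>>> c2
Counter({'zhang': 2, 'peter': 1, 'tom': 1})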
1.2 Common Counter methods
1) most_common: return the n most common elements and their counts, ordered from most to least common
Code:
def most_common(self, n=None):
    '''List the n most common elements and their counts from the most
    common to the least.  If n is None, then list all element counts.

    >>> Counter('abcdeabcdabcaba').most_common(3)
    [('a', 5), ('b', 4), ('c', 3)]

    '''
    # Emulate Bag.sortedByCount from Smalltalk
    if n is None:
        return sorted(self.items(), key=_itemgetter(1), reverse=True)
    return _heapq.nlargest(n, self.items(), key=_itemgetter(1))
Example:
>>> c
Counter({'a': 4, 'd': 2, 'e': 2, 'r': 2, 'w': 2, 'f': 2, 'q': 1})
>>> c.most_common(3)
[('a', 4), ('d', 2), ('e', 2)]
>>> c.most_common(2)
[('a', 4), ('d', 2)]
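A common use of most_common is a word-frequency count. The following is a minimal sketch; the sample sentence and the variable names are made up for illustration.

```python
from collections import Counter

# Hypothetical sample text, chosen only for illustration.
text = "the quick brown fox jumps over the lazy dog the fox"
word_counts = Counter(text.split())

# The two most frequent words, most common first.
top_two = word_counts.most_common(2)
print(top_two)

# With no argument, most_common() lists every element with its count.
all_counts = word_counts.most_common()
print(len(all_counts))
```

Passing n avoids sorting the whole counter when only the top few entries are needed.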
2) elements: return an iterator over all elements
Code:
def elements(self):
    '''Iterator over elements repeating each as many times as its count.

    >>> c = Counter('ABCABC')
    >>> sorted(c.elements())
    ['A', 'A', 'B', 'B', 'C', 'C']

    # Knuth's example for prime factors of 1836:  2**2 * 3**3 * 17**1
    >>> prime_factors = Counter({2: 2, 3: 3, 17: 1})
    >>> product = 1
    >>> for factor in prime_factors.elements():     # loop over factors
    ...     product *= factor                       # and multiply them
    >>> product
    1836

    Note, if an element's count has been set to zero or is a negative
    number, elements() will ignore it.

    '''
    # Emulate Bag.do from Smalltalk and Multiset.begin from C++.
    return _chain.from_iterable(_starmap(_repeat, self.items()))
Example:
>>> c = collections.Counter("adfawreqewradfa")
>>> c
Counter({'a': 4, 'd': 2, 'e': 2, 'r': 2, 'w': 2, 'f': 2, 'q': 1})
>>> c.elements()
<itertools.chain object at 0x...>
>>> list(c.elements())
['d', 'd', 'q', 'e', 'e', 'r', 'r', 'w', 'w', 'f', 'f', 'a', 'a', 'a', 'a']
Note: elements() returns an iterator. Convert it to a list with the built-in list(), or traverse it directly with for ... in.
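As a small sketch of traversing elements() directly with a for loop, the example below also shows that members whose count is zero or negative are skipped. The stock counter and its keys are hypothetical.

```python
from collections import Counter

# Hypothetical inventory; 'plum' and 'fig' have non-positive counts.
stock = Counter(apple=2, pear=1, plum=0, fig=-1)

expanded = []
for item in stock.elements():  # iterate without building a list first
    expanded.append(item)

# Sort so the result does not depend on the counter's internal order.
ordered = sorted(expanded)
print(ordered)
```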
3) update: add counts from an iterable or mapping; a member that already exists has its count increased, and a member that does not exist is created
Code:
def update(*args, **kwds):
    '''
    Similar to dict.update; counts are accumulated as members are added.
    Like dict.update() but add counts instead of replacing them.

    Source can be an iterable, a dictionary, or another Counter instance.

    >>> c = Counter('which')
    >>> c.update('witch')           # add elements from another iterable
    >>> d = Counter('watch')
    >>> c.update(d)                 # add elements from another counter
    >>> c['h']                      # four 'h' in which, witch, and watch
    4

    '''
    # The regular dict.update() operation makes no sense here because the
    # replace behavior results in the some of original untouched counts
    # being mixed-in with all of the other counts for a mismash that
    # doesn't have a straight-forward interpretation in most counting
    # contexts.  Instead, we implement straight-addition.  Both the inputs
    # and outputs are allowed to contain zero and negative counts.

    if not args:
        raise TypeError("descriptor 'update' of 'Counter' object "
                        "needs an argument")
    self, *args = args
    if len(args) > 1:
        raise TypeError('expected at most 1 arguments, got %d' % len(args))
    iterable = args[0] if args else None
    if iterable is not None:
        if isinstance(iterable, Mapping):
            if self:
                self_get = self.get
                for elem, count in iterable.items():
                    self[elem] = count + self_get(elem, 0)
            else:
                super(Counter, self).update(iterable)  # fast path when counter is empty
        else:
            _count_elements(self, iterable)
    if kwds:
        self.update(kwds)
Example:
>>> c = collections.Counter(['zhang', 'peter', 'tom', 'zhang'])
>>> c
Counter({'zhang': 2, 'peter': 1, 'tom': 1})
>>> c.update('peter')
>>> c
Counter({'zhang': 2, 'e': 2, 'peter': 1, 't': 1, 'r': 1, 'tom': 1, 'p': 1})
Note that the argument is an iterable: if you pass a string, every character of the string is treated as a separate element.
>>> c = collections.Counter(['zhang', 'peter', 'tom', 'zhang'])
>>> c.update(['zhang'])
>>> c
Counter({'zhang': 3, 'peter': 1, 'tom': 1})
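The source of update above also accepts a mapping and keyword arguments. The sketch below exercises the three input forms on the same sample names; it is only an illustration of those code paths.

```python
from collections import Counter

c = Counter(['zhang', 'peter', 'tom', 'zhang'])

c.update(['peter'])   # iterable: the list holds one element, 'peter'
c.update({'tom': 3})  # mapping: the value 3 is added to tom's count
c.update(zhang=1)     # keyword argument: handled like a mapping

print(c['zhang'], c['peter'], c['tom'])
```

Wrapping a single string in a list, as in the first call, avoids the per-character counting shown earlier.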
4) subtract: subtract counts; each occurrence in the argument lowers that member's count by 1
Code:
def subtract(*args, **kwds):
    '''Like dict.update() but subtracts counts instead of replacing them.
    Counts can be reduced below zero.  Both the inputs and outputs are
    allowed to contain zero and negative counts.

    Source can be an iterable, a dictionary, or another Counter instance.

    >>> c = Counter('which')
    >>> c.subtract('witch')             # subtract elements from another iterable
    >>> c.subtract(Counter('watch'))    # subtract elements from another counter
    >>> c['h']                          # 2 in which, minus 1 in witch, minus 1 in watch
    0
    >>> c['w']                          # 1 in which, minus 1 in witch, minus 1 in watch
    -1

    '''
    if not args:
        raise TypeError("descriptor 'subtract' of 'Counter' object "
                        "needs an argument")
    self, *args = args
    if len(args) > 1:
        raise TypeError('expected at most 1 arguments, got %d' % len(args))
    iterable = args[0] if args else None
    if iterable is not None:
        self_get = self.get
        if isinstance(iterable, Mapping):
            for elem, count in iterable.items():
                self[elem] = self_get(elem, 0) - count
        else:
            for elem in iterable:
                self[elem] = self_get(elem, 0) - 1
    if kwds:
        self.subtract(kwds)
Example:
>>> c = collections.Counter(['zhang', 'peter', 'tom', 'zhang'])
>>> c
Counter({'zhang': 2, 'peter': 1, 'tom': 1})
>>> c.subtract(['zhang'])
>>> c
Counter({'peter': 1, 'tom': 1, 'zhang': 1})
>>> c.subtract(['zhang'])
>>> c.subtract(['zhang'])
>>> c
Counter({'peter': 1, 'tom': 1, 'zhang': -1})
Note: if a member is absent or its count is already 0, subtract keeps decrementing, so a count can be zero or negative. elements() does not show such a member; a zero or negative count only records that the key has been seen. Likewise, when the count is negative, adding the member back does not make it reappear in elements() until its count is greater than 0.
>>> list(c.elements())
['peter', 'tom']
>>> c.update(['zhang'])
>>> list(c.elements())
['peter', 'tom']
>>> c
Counter({'peter': 1, 'tom': 1, 'zhang': 0})
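To drop members whose count is zero or negative, one option is the unary plus operator that Counter supports (Python 3.3 and later). The sketch below reuses the sample names from the example above.

```python
from collections import Counter

c = Counter(['zhang', 'peter', 'tom', 'zhang'])
c.subtract(['zhang', 'zhang', 'zhang'])  # zhang: 2 - 3 = -1

# The key is still stored even though its count is negative.
still_present = 'zhang' in c

# Unary plus returns a new Counter holding only the positive counts.
positive = +c
print(positive)
```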
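2. OrderedDict: ordered dictionary
2.1 Description and definition
Dictionaries were historically unordered (since Python 3.7 the built-in dict also preserves insertion order). OrderedDict extends the dictionary so that it remembers the order in which members were added.
>>> oc = collections.OrderedDict()
An ordered dictionary can also be initialized from an existing dictionary:
>>> old_dic = {'a': 1, 'b': 2, 'c': 3}
>>> new_dic = collections.OrderedDict(old_dic)
>>> new_dic
OrderedDict([('b', 2), ('c', 3), ('a', 1)])
Note: this output comes from a Python version in which old_dic is unordered, so the OrderedDict does not start in the order in which old_dic was written. On Python 3.7 and later the order would be a, b, c. In either case, members added afterwards keep their insertion order:
>>> new_dic['d'] = 4
>>> new_dic['e'] = 5
>>> new_dic
OrderedDict([('b', 2), ('c', 3), ('a', 1), ('d', 4), ('e', 5)])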
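If the initial order matters on every Python version, one approach is to initialize the OrderedDict from a sequence of (key, value) pairs instead of a plain dict. A minimal sketch, with made-up variable names:

```python
from collections import OrderedDict

# A list of pairs has a well-defined order on every Python version.
pairs = [('b', 2), ('c', 3), ('a', 1)]
od = OrderedDict(pairs)
od['d'] = 4  # new keys are appended at the end

key_order = list(od)
print(key_order)
```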
2.2 Common methods
1) clear: empty the dictionary
Code:
def clear(self):  # real signature unknown; restored from __doc__
    """
    Empty the dictionary.
    od.clear() -> None.  Remove all items from od.
    """
    pass
Example:
>>> dic = collections.OrderedDict({'a': 1, 'b': 2, 'c': 3})
>>> dic
OrderedDict([('b', 2), ('c', 3), ('a', 1)])
>>> dic.clear()
>>> dic
OrderedDict()
2) keys: return an iterable view of all keys
Code:
def keys(self, *args, **kwargs):  # real signature unknown
    pass
Example:
>>> dic = collections.OrderedDict({'a': 1, 'b': 2, 'c': 3})
>>> dic.keys()
KeysView(OrderedDict([('b', 2), ('c', 3), ('a', 1)]))
Note: the result is an iterable object that can be traversed with for ... in. Unlike the built-in dict of older Python versions, the keys of an ordered dictionary come back in order.
3) values: return an iterable view of all values
Code:
def values(self, *args, **kwargs):  # real signature unknown
    pass
Example:
>>> dic = collections.OrderedDict({'a': 1, 'b': 2, 'c': 3})
>>> dic.values()
ValuesView(OrderedDict([('b', 2), ('c', 3), ('a', 1)]))
Note: the values are ordered as well.
4) items: return an iterable view of (key, value) pairs
Code:
def items(self, *args, **kwargs):  # real signature unknown
    pass
Example:
>>> dic.items()
ItemsView(OrderedDict([('b', 2), ('c', 3), ('a', 1)]))
5) pop: remove the element with the given key
Code:
def pop(self, k, d=None):  # real signature unknown; restored from __doc__
    """
    Remove the element with the given key and return its value.
    k: key of the element to remove
    d: default value returned if the key does not exist
    od.pop(k[,d]) -> v, remove specified key and return the corresponding
    value.  If key is not found, d is returned if given, otherwise KeyError
    is raised.
    """
    pass
Example:
>>> dic = collections.OrderedDict({'a': 1, 'b': 2, 'c': 3})
>>> dic
OrderedDict([('b', 2), ('c', 3), ('a', 1)])
>>> dic.pop('b')
2
>>> dic
OrderedDict([('c', 3), ('a', 1)])
>>> dic.pop('d', 10)
10
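6) popitem: remove the last element
Code:
def popitem(self):  # real signature unknown; restored from __doc__
    """
    Remove the last element and return its key and value.
    od.popitem() -> (k, v), return and remove a (key, value) pair.
    Pairs are returned in LIFO order if last is true or FIFO order if false.
    """
    pass
Example:
>>> dic = collections.OrderedDict({'a': 1, 'b': 2, 'c': 3})
>>> dic
OrderedDict([('b', 2), ('c', 3), ('a', 1)])
>>> dic.popitem()
('a', 1)
Note: because the dictionary is ordered, the removal is not arbitrary as it was for the built-in dict of older Python versions; the item in the last position is removed. The last mentioned in the docstring is an optional parameter that defaults to True.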
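The docstring above mentions LIFO and FIFO order. As a short sketch, passing last=False to popitem removes the first item instead of the last one:

```python
from collections import OrderedDict

od = OrderedDict([('b', 2), ('c', 3), ('a', 1)])

last_item = od.popitem()             # LIFO: removes the last pair
first_item = od.popitem(last=False)  # FIFO: removes the first pair

remaining = list(od.items())
print(last_item, first_item, remaining)
```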
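7) setdefault: set a default value for a key
Code:
def setdefault(self, k, d=None):  # real signature unknown; restored from __doc__
    """
    Set a default value for a key: return the key's value if the key exists;
    otherwise store d under the key and return d.
    od.setdefault(k[,d]) -> od.get(k,d), also set od[k]=d if k not in od
    """
    pass
Example: same as for the built-in dict.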
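Since no example is given for setdefault, here is a minimal sketch of its three cases: an existing key, a missing key with a default, and a missing key with no default. The keys and values are arbitrary.

```python
from collections import OrderedDict

od = OrderedDict([('a', 1)])

existing = od.setdefault('a', 100)  # key exists: value unchanged
created = od.setdefault('b', 2)     # key missing: stored with d=2
no_default = od.setdefault('c')     # key missing, no d: stored as None

final_items = list(od.items())
print(final_items)
```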
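8) update: merge another dictionary into the current one
Code:
def update(self, *args, **kwargs):  # real signature unknown
    pass
Example: same as for the built-in dict; the only difference is ordered versus unordered.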
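A small sketch of how update interacts with ordering: a key that already exists keeps its position and only its value changes, while a new key is appended at the end. The data is made up for illustration.

```python
from collections import OrderedDict

od = OrderedDict([('a', 1), ('b', 2)])

# 'b' already exists, so it stays in place; 'c' is new and goes last.
od.update([('b', 20), ('c', 3)])

updated_items = list(od.items())
print(updated_items)
```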
9) move_to_end: move an existing element to the end of the dictionary
Code:
def move_to_end(self, *args, **kwargs):  # real signature unknown
    """
    Move an element to the end of the dictionary; a KeyError is raised if
    the element does not exist.
    Move an existing element to the end (or beginning if last==False).

    Raises KeyError if the element does not exist.
    When last=True, acts like a fast version of self[key]=self.pop(key).
    """
    pass
Example:
>>> dic = collections.OrderedDict({'a': 1, 'b': 2, 'c': 3})
>>> dic
OrderedDict([('b', 2), ('c', 3), ('a', 1)])
>>> dic.move_to_end('b')
>>> dic
OrderedDict([('c', 3), ('a', 1), ('b', 2)])
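The docstring also describes last=False and the KeyError case. The sketch below covers both; the key 'zz' is a deliberately missing key chosen for illustration.

```python
from collections import OrderedDict

od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])

od.move_to_end('a')              # order is now b, c, a
od.move_to_end('c', last=False)  # order is now c, b, a
key_order = list(od)

try:
    od.move_to_end('zz')         # 'zz' is not in the dictionary
    raised = False
except KeyError:
    raised = True

print(key_order, raised)
```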
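3. defaultdict: dictionary with default values
defaultdict extends the dictionary: it gives the values of missing keys a default data type. Everything else behaves like the built-in dict.
>>> ddic = collections.defaultdict(list)  # specify the default type at definition; here it is list
>>> ddic['k1'].append('a')  # 'k1' has no value yet, but it defaults to a list, so append can be called directly
>>> ddic
defaultdict(<class 'list'>, {'k1': ['a']})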
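Two typical patterns follow from this: grouping with defaultdict(list) and counting with defaultdict(int). A minimal sketch with a made-up word list:

```python
from collections import defaultdict

# Hypothetical input data.
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# Group words by first letter; a missing key starts as an empty list.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# Count words per first letter; a missing key starts as int(), i.e. 0.
letter_counts = defaultdict(int)
for word in words:
    letter_counts[word[0]] += 1

print(dict(by_letter))
print(dict(letter_counts))
```

Note that merely reading a missing key, as in by_letter['z'], also creates that key with the default value.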
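4. namedtuple: tuple with named fields
A named tuple extends the tuple. It has all the tuple methods, and in addition each element can be given a name, so an element can be accessed by name instead of by index.
>>> MytupleClass = collections.namedtuple('MytupleClass', ['x', 'y', 'z'])
>>> mytup = MytupleClass(11, 22, 33)
>>> mytup.x
11
>>> mytup.y
22
>>> mytup.z
33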
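Because a named tuple is still a tuple, index access and unpacking keep working. The sketch below also uses the _replace and _asdict helper methods that namedtuple classes provide; the class name Point is chosen for illustration.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y', 'z'])
p = Point(11, 22, 33)

same_value = p[0] == p.x  # index and name refer to the same element
x, y, z = p               # ordinary tuple unpacking

moved = p._replace(x=0)      # returns a new tuple; p itself is immutable
as_dict = dict(p._asdict())  # field names mapped to values

print(moved, as_dict)
```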
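5. deque: double-ended queue
deque is a double-ended queue that resembles a list. Unlike a list, its appends and pops at either end are thread-safe, and elements can enter and leave on both sides.
5.1 Definition
d = collections.deque()
5.2 Common methods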
1) append: add an element at the right end of the queue
Code:
def append(self, *args, **kwargs):  # real signature unknown
    """
    Add an element at the right end of the queue.
    Add an element to the right side of the deque.
    """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d
deque([1, 2, 3])
>>> d.append(4)
>>> d
deque([1, 2, 3, 4])
2) appendleft: add an element at the left end of the queue
Code:
def appendleft(self, *args, **kwargs):  # real signature unknown
    """
    Add an element at the left end of the queue.
    Add an element to the left side of the deque.
    """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d
deque([1, 2, 3])
>>> d.appendleft(4)
>>> d
deque([4, 1, 2, 3])
3) clear: empty the queue
Code:
def clear(self, *args, **kwargs):  # real signature unknown
    """
    Empty the queue.
    Remove all elements from the deque.
    """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d
deque([1, 2, 3])
>>> d.clear()
>>> d
deque([])
4) count: return the number of times a member occurs
Code:
def count(self, value):  # real signature unknown; restored from __doc__
    """
    Return the number of occurrences of an element.
    D.count(value) -> integer -- return number of occurrences of value
    """
    return 0
Example:
>>> d = collections.deque([1, 2, 3, 2])
>>> d.count(2)
2
5) extend: extend the right side of the queue with an iterable
Code:
def extend(self, *args, **kwargs):  # real signature unknown
    """
    Extend the right side of the queue with an iterable.
    Extend the right side of the deque with elements from the iterable
    """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d
deque([1, 2, 3])
>>> d.extend([4, 5])
>>> d
deque([1, 2, 3, 4, 5])
6) extendleft: extend the left side of the queue with an iterable
Code:
def extendleft(self, *args, **kwargs):  # real signature unknown
    """
    Extend the left side of the queue with an iterable.
    Extend the left side of the deque with elements from the iterable
    """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d
deque([1, 2, 3])
>>> d.extendleft([4, 5])
>>> d
deque([5, 4, 1, 2, 3])
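The reversed order in the result (5 before 4) follows from the elements being added on the left one at a time. A short sketch showing that extendleft gives the same result as repeated appendleft calls:

```python
from collections import deque

extended = deque([1, 2, 3])
extended.extendleft([4, 5])

# The same effect, one element at a time.
manual = deque([1, 2, 3])
for value in [4, 5]:
    manual.appendleft(value)  # 4 goes in first, then 5 in front of it

print(extended, manual)
```

To keep the original order of the iterable on the left side, one option is to pass reversed([4, 5]) instead.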
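7) index: find a value and return its index
Code:
def index(self, value, start=None, stop=None):  # real signature unknown; restored from __doc__
    """
    Look up an element. A ValueError is raised if it does not exist;
    otherwise the index of the first match is returned.
    value: the element to look for
    start: index at which the search starts
    stop: index at which the search stops
    D.index(value, [start, [stop]]) -> integer -- return first index of value.
    Raises ValueError if the value is not present.
    """
    return 0
Note: usage is the same as for lists. Although the queue is double-ended, indexes are still numbered from left to right.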
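A minimal sketch of index, assuming Python 3.5 or later (where deque.index is available). It shows the first match, a search that starts at a later position, and the ValueError for a missing value.

```python
from collections import deque

d = deque([1, 2, 3, 2])

first_match = d.index(2)     # search from the left end
later_match = d.index(2, 2)  # start searching at index 2

try:
    d.index(9)               # 9 is not in the deque
    missing_raises = False
except ValueError:
    missing_raises = True

print(first_match, later_match, missing_raises)
```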
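8) insert: insert an element at an index
Not available in the Python version used here (deque.insert was added in Python 3.5):
>>> d = collections.deque([1, 2, 3])
>>> d.insert(0, 4)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'collections.deque' object has no attribute 'insert'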
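On Python 3.5 or later the same call succeeds, because deque.insert(i, x) was added in that release. A short sketch, assuming such an interpreter:

```python
from collections import deque

d = deque([1, 2, 3])
d.insert(0, 4)  # insert the value 4 at index 0
d.insert(2, 9)  # insert the value 9 at index 2

result = list(d)
print(result)
```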
9) pop: remove an element from the right end of the queue and return it
Code:
def pop(self, *args, **kwargs):  # real signature unknown
    """
    Remove an element from the right end of the queue and return it.
    Remove and return the rightmost element.
    """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d.pop()
3
10) popleft: remove an element from the left end of the queue and return it
Code:
def popleft(self, *args, **kwargs):  # real signature unknown
    """
    Remove an element from the left end of the queue and return it.
    Remove and return the leftmost element.
    """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d.popleft()
1
11) remove: delete an element
Code:
def remove(self, value):  # real signature unknown; restored from __doc__
    """
    Search from the left end of the queue and delete the first match.
    D.remove(value) -- remove first occurrence of value.
    """
    pass
Example:
>>> d = collections.deque([1, 2, 3, 2])
>>> d
deque([1, 2, 3, 2])
>>> d.remove(2)
>>> d
deque([1, 3, 2])
12) reverse: reverse the queue
Code:
def reverse(self):  # real signature unknown; restored from __doc__
    """
    Reverse the queue.
    D.reverse() -- reverse *IN PLACE*
    """
    pass
Example:
>>> d = collections.deque([1, 2, 3])
>>> d.reverse()
>>> d
deque([3, 2, 1])
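13) rotate: rotate the queue
Rotation can be pictured by treating the deque as a ring whose head and tail are joined: rotating moves every element by the given number of positions, as shown in the figure below. Equivalently, for a positive n, an element is taken from the right end and added to the left end, n times.
Code:
def rotate(self, *args, **kwargs):  # real signature unknown
    """
    Rotate the queue; the default is to move by 1 position.
    Rotate the deque n steps to the right (default n=1).  If n is negative, rotates left.
    """
    pass
Example:
>>> d = collections.deque([1, 2, 3, 4, 5])
>>> d
deque([1, 2, 3, 4, 5])
>>> d.rotate(2)
>>> d
deque([4, 5, 1, 2, 3])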
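As a sketch of what a rotation does, the example below compares rotate(2) with two manual pop/appendleft steps, and then rotates to the left with a negative argument.

```python
from collections import deque

rotated = deque([1, 2, 3, 4, 5])
rotated.rotate(2)  # two steps to the right

# The same result by hand: move the rightmost element to the left, twice.
manual = deque([1, 2, 3, 4, 5])
for _ in range(2):
    manual.appendleft(manual.pop())

# A negative n rotates to the left instead.
left = deque([1, 2, 3, 4, 5])
left.rotate(-1)

print(rotated, manual, left)
```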