量化交易吧 / 源码分享帖子：3383531 新帖：0

Query对象的简单使用教程

耶伦发表于：5 月 9 日 17：49回复(1)

大家刚接触平台可能对于一些数据的提取方式不甚明了,尤其是对query对象的使用。可能将简单问题复杂化,以下整理了大家使用过程中一般能用到的操作方式以及一些常用的数据获取方式。如果大家对于query对象的使用或者某些数据的提取存在疑问,可以在此贴下留言讨论。

其实对数据的查询过程,filter是最重要的一步,filter可以对多个条件进行过滤,中间用,隔开。
比如查询某只股票某一个时间段,满足特定值的数据,可以用到 in_ ,and_,> , < , == ,!=等符号进行基本的查询

涉及到使用数据库操作的数据有:

get_fundamentals,get_fundamentals_continuously (股票单季度财务/估值数据)
finance (股票数据,基金数据,舆情数据等)
opt (期权数据)
macro (宏观数据)
jy (聚源数据)

基本的查询方式

query() 填写需要查询的对象,可以是整张表,也可以是表中的多个字段或计算出的结果
filter 填写过滤条件,多个过滤条件可以用逗号隔开,或者用and_,or_这样的语法
order_by() 填写排序条件

.desc() 降序排列
.asc() 升序排列

limit() 限制返回的个数
group_by() 分组统计

对于数据的查询过滤,一般通过filter函数进行实现,filter可以对多个条件进行过滤,中间用,隔开。
比如查询某只股票某一个时间段,满足特定值的数据,可以用到 in_ ,and_,> , < , == ,!=等符号进行基本的查询

涉及到使用数据库操作的数据有:

get_fundamentals (股票单季度财务数据)
finance (股票数据,基金数据等)
opt (期权数据)
macro (宏观数据)
jy (聚源数据)

基本的查询方式¶

query() 填写需要查询的对象,可以是整张表,也可以是表中的多个字段或计算出的结果
filter 填写过滤条件,多个过滤条件可以用逗号隔开,或者用and,or这样的语法
order_by 填写排序条件

.desc() 降序排列
.asc() 升序排列

limit 限制返回的个数
order_by 分组统计

# 查询 000001 2015 - 2017 年的整张合并利润表,且满足 net_profit(净利润) < 0 的条件from jqdata import financeq = query(finance.STK_INCOME_STATEMENT).filter(finance.STK_INCOME_STATEMENT.code=='000783.XSHE',              #选定股票  000783.XSHEfinance.STK_INCOME_STATEMENT.end_date > '2005-01-01',          #指定查询时间段大于2005年1月1日finance.STK_INCOME_STATEMENT.end_date < '2018-01-01',          #指定查询时间段小于2018年1月1日finance.STK_INCOME_STATEMENT.net_profit <0,                    #指定查询到的数据中net_profit为负finance.STK_INCOME_STATEMENT.report_type == 0,                 #指定报告期类型为本期).order_by(finance.STK_INCOME_STATEMENT.end_date.desc()  ).limit(5)   #根据end_date降序排序,并返回前5条数据 finance.run_query(q)

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	id	company_id	company_name	code	a_code	b_code	h_code	pub_date	start_date	end_date	...	commission_income	interest_expense	commission_expense	refunded_premiums	net_pay_insurance_claims	withdraw_insurance_contract_reserve	policy_dividend_payout	reinsurance_cost	non_current_asset_disposed	other_earnings
0	257078	430000783	长江证券股份有限公司	000783.XSHE	000783	None	None	2007-10-22	2007-01-01	2007-09-30	...	None	None	None	None	None	None	None	None	None	None
1	257070	430000783	长江证券股份有限公司	000783.XSHE	000783	None	None	2006-10-25	2006-01-01	2006-09-30	...	None	None	None	None	None	None	None	None	None	None
2	257068	430000783	长江证券股份有限公司	000783.XSHE	000783	None	None	2006-08-17	2006-01-01	2006-06-30	...	None	None	None	None	None	None	None	None	None	None
3	257066	430000783	长江证券股份有限公司	000783.XSHE	000783	None	None	2006-04-29	2006-01-01	2006-03-31	...	None	None	None	None	None	None	None	None	None	None
4	257064	430000783	长江证券股份有限公司	000783.XSHE	000783	None	None	2006-04-29	2005-01-01	2005-12-31	...	None	None	None	None	None	None	None	None	None	None

5 rows × 60 columns

get_fundamentals 获取某个/多个季度/年度的数据(指定参数statDate)¶

# 查询平安银行2014年3-6月份的单季度报表q = query(  income.statDate,  income.code,  income.basic_eps,  cash_flow.goods_sale_and_service_render_cash
  ).filter(  income.code == '000001.XSHE',
  )ret = get_fundamentals(q, statDate='2014q2')ret

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	statDate	code	basic_eps	goods_sale_and_service_render_cash
0	2014-06-30	000001.XSHE	0.35	NaN

# 查询平安银行2014年的年报q = query(  income.statDate,  income.code,  income.basic_eps,  cash_flow.goods_sale_and_service_render_cash
  ).filter(  income.code == '000001.XSHE',
  )ret = get_fundamentals(q, statDate='2014')ret

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	statDate	code	basic_eps	goods_sale_and_service_render_cash
0	2014-12-31	000001.XSHE	1.73	NaN

# 查询平安银行2014年四个季度的季报, 放到数组中并拼接为dataframeq = query(  income.statDate,  income.code,  income.basic_eps,  balance.cash_equivalents,  cash_flow.goods_sale_and_service_render_cash
  ).filter(  income.code == '000001.XSHE',
  )rets = [get_fundamentals(q, statDate='2014q'+str(i)) for i in range(1, 5)]import pandas as pdpd.concat(rets)

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

statDate	code	basic_eps	cash_equivalents	goods_sale_and_service_render_cash
2014-03-31	000001.XSHE	0.53	2.581100e+11	NaN
2014-06-30	000001.XSHE	0.35	2.596040e+11	NaN
2014-09-30	000001.XSHE	0.49	2.773250e+11	NaN
2014-12-31	000001.XSHE	0.36	3.062980e+11	NaN

in_ 判断某个字段的值是否在列表之中(一般用于查询多个标的)¶

stocks = ['000001.XSHE','600741.XSHG','600507.XSHG']q = query(balance.code,balance.pubDate,balance.statDate,    # 指定返回的字段只包括code,pubDate,statDate,total_assets及total_sheet_owner_equities  balance.total_assets,balance.total_sheet_owner_equities).filter( balance.code.in_(stocks)   #指定查询到的数据只包括code在 stocks中的数据)get_fundamentals(q,date='2018-01-05')    #查询单季度数据中在2018-01-05之前发布的数据,没有未来函数

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	code	pubDate	statDate	total_assets	total_sheet_owner_equities
0	000001.XSHE	2017-10-21	2017-09-30	3.137481e+12	3.137481e+12
1	600507.XSHG	2017-10-24	2017-09-30	1.041599e+10	1.041599e+10
2	600741.XSHG	2017-10-27	2017-09-30	1.153612e+11	1.153612e+11

distinct 去重,用于查看数据库中某个字段都存在哪些值¶

#  查看十大流通股东中都有哪些类别from jqdata import financeq = query( finance.STK_SHAREHOLDER_FLOATING_TOP10.shareholder_class_id.distinct(),  #提取ID不同的数据finance.STK_SHAREHOLDER_FLOATING_TOP10.shareholder_class, ).order_by(finance.STK_SHAREHOLDER_FLOATING_TOP10.change_reason_id)  df = finance.run_query(q)df.tail()

	anon_1	shareholder_class
23	307012	银行
24	307024	银行和上市公司
25	307025	信托投资管理公司和上市公司
26	307024	上市公司和银行
27	307027	银行和QFII

~ 反向查询符号¶

q = query( finance.STK_SHAREHOLDER_FLOATING_TOP10.shareholder_class_id.distinct(),finance.STK_SHAREHOLDER_FLOATING_TOP10.shareholder_class, ).filter(~finance.STK_SHAREHOLDER_FLOATING_TOP10.shareholder_class_id.in_(['307007','307001']),#等同于notin_)finance.run_query(q)

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	anon_1	shareholder_class
0	307002	券商
1	307003	证券投资基金
2	307004	社保基金
3	307005	企业年金
4	307006	保险公司
5	307008	基金管理公司
6	307009	信托投资管理公司
7	307011	上市公司
8	307012	银行
9	307013	风险投资
10	307014	保险投资组合
11	307015	券商资产管理计划
12	307016	信托资产管理计划
13	307017	基金资产管理计划
14	307018	资产管理公司资产管理计划
15	307019	国有资产经营公司
16	307020	期货资产管理计划
17	307021	地方国资委
18	307022	券商和上市公司
19	307023	保险公司和上市公司
20	307024	银行和上市公司
21	307024	上市公司和银行
22	307025	信托投资管理公司和上市公司
23	307027	银行和QFII
24	307099	其他机构

运算和命名(label)¶

# label的作用是命名获得数据的标签,一般用于直接运算后的重命名q = query(indicator.code.label('股票代码'),indicator.operating_profit.label('get_operating_profit'), (income.total_operating_revenue - income.total_operating_cost).label('my_operating_profit') ).filter(indicator.code=='600507.XSHG')get_fundamentals(q)

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	股票代码	get_operating_profit	my_operating_profit
0	600507.XSHG	1.329762e+09	1.329762e+09

or_ 或查询¶

from sqlalchemy.sql.expression import or_get_fundamentals(query(  valuation.code  ).filter(  or_(valuation.market_cap < 10,valuation.pe_ratio > 1000))).tail()  # 查询当天总市值小于1000亿或pe大于10亿的所有股票

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	code
22	600506.XSHG
23	600520.XSHG
24	600766.XSHG
25	600882.XSHG
26	600890.XSHG

contains/like/ilike 数据库中的字符串模糊匹配¶

% 百分号通配符: 表示任何字符出现任意次数(可以是0次).
_ 下划线通配符:表示只能匹配单个字符,不能多也不能少,就是一个字符.

from jqdata import finance# 获取000001.XSHG每一年的年报   df = finance.run_query(query(finance.FINANCE_INCOME_STATEMENT   ).filter(finance.FINANCE_INCOME_STATEMENT.code.in_(['000001.XSHE']),#                 finance.FINANCE_INCOME_STATEMENT.end_date.like('%-12-31%'),    #sqlalchemy的版本,数据库的构建等问题可能导致查询报错,可以换以下两种方法尝试#               finance.FINANCE_INCOME_STATEMENT.end_date.contains('12-31') ,      finance.FINANCE_INCOME_STATEMENT.end_date.ilike('_____12-31'),finance.FINANCE_INCOME_STATEMENT.report_type==0   ))df.tail()

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	id	company_id	company_name	code	a_code	b_code	h_code	pub_date	start_date	end_date	...	net_profit	np_parent_company_owners	minority_profit	eps	basic_eps	diluted_eps	other_composite_income	total_composite_income	ci_parent_company_owners	ci_minority_owners
20	2535	430000001	平安银行股份有限公司	000001.XSHE	000001	None	None	2014-03-07	2013-01-01	2013-12-31	...	1.523100e+10	1.523100e+10	NaN	None	1.86	1.86	-1.811000e+09	1.342000e+10	1.342000e+10	NaN
21	2543	430000001	平安银行股份有限公司	000001.XSHE	000001	None	None	2015-03-13	2014-01-01	2014-12-31	...	1.980200e+10	1.980200e+10	NaN	None	1.73	1.73	5.860000e+08	2.038800e+10	NaN	NaN
22	2551	430000001	平安银行股份有限公司	000001.XSHE	000001	None	None	2016-03-10	2015-01-01	2015-12-31	...	2.186500e+10	2.186500e+10	NaN	None	1.56	1.56	NaN	2.259900e+10	NaN	NaN
23	2559	430000001	平安银行股份有限公司	000001.XSHE	000001	None	None	2017-03-17	2016-01-01	2016-12-31	...	2.259900e+10	2.259900e+10	NaN	None	1.32	1.32	3.080000e+08	2.290700e+10	2.290700e+10	NaN
24	2564	430000001	平安银行股份有限公司	000001.XSHE	000001	None	None	2018-03-15	2017-01-01	2017-12-31	...	2.318900e+10	2.318900e+10	NaN	None	1.30	1.30	2.810000e+08	2.347000e+10	2.347000e+10	NaN

5 rows × 66 columns

简化计算的方法(sqlalchemy.sql.func) 和 group_by¶

func中的函数可以完成常用的一些计算,配合group_by完成简单的统计
下边以简单计算股票池/全市场每天的的等权重pe,pb为例
(当然,这个计算中一些细节并没有考虑到,实际还是以官方数据为准,可以通过聚源数据获取,具体下边会有演示)

from sqlalchemy.sql import funcfrom jqdata import financestocks = get_index_stocks('000016.XSHG')q = query(valuation.day,#           func.count('*'),       # 信息条数(也就是股票数量)  (func.count()/func.sum(1/valuation.pe_ratio)).label('*g_pe'),  # 等权重pe  (func.count()/func.sum(1/valuation.pb_ratio)).label('*g_pb'),  # 等权重pb ).group_by(valuation.day   ).filter(valuation.pe_ratio>0,   #排除亏损的valuation.pb_ratio>0,   valuation.day>'2009-12-01',  #设置一个时间valuation.code.in_(stocks)   #设置股票池,注释即为全市场)df = finance.run_query(q)df.set_index('day',inplace=True)df.tail()

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	*g_pe	*g_pb
day
2019-01-16	10.081409	1.295044
2019-01-17	10.036275	1.289927
2019-01-18	10.201209	1.309428
2019-01-21	10.262107	1.314957
2019-01-22	10.262107	1.314957

更多关于query的可导入函数,可以使用以下方法查看,配合官网文档使用:

import sqlalchemy
dir(sqlalchemy.sql.expression)

其他常用数据获取方式整理¶

获取旧的申万指数列表和新的申万指数列表(14年有过改动)¶

from jqdata import *def get_sw_quote(day=None):'''获取指定日期的申万指数列表'''day = get_trade_days(end_date=day,count=1)[-1]df = jy.run_query(query( jy.QT_SYWGIndexQuote.InnerCode.distinct().label('InnerCode')).filter(jy.QT_SYWGIndexQuote.TradingDay==day,))code_df = jy.run_query(query( jy.SecuMain.SecuCode,jy.SecuMain.ChiName).filter(jy.SecuMain.InnerCode.in_(df.InnerCode)))return code_dfdf = get_sw_quote('2013-02-05')df.tail()

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	SecuCode	ChiName
466	852041	申银万国指数-专业连锁
467	852051	申银万国指数-一般物业经营
468	852052	申银万国指数-专业市场
469	852225	申银万国指数-软件开发
470	852226	申银万国指数-IT服务

查询申万行情( 包含行业pe/pb)¶

from jqdata import jyfrom jqdata import *import pandas as pd#注意申万指数在2014年有一次大改,聚源使用的是为改变之前的代码,官网包含更改前和更改后的代码,如果遇到找不到的标的可以根据需求自行查找#如801124 >>801121食品加工IIdef get_sw_quote(code,end_date=None,count=None,start_date=None):'''获取申万指数行情,返回panel结构'''if isinstance(code,str):code=[code]days = get_trade_days(start_date,end_date,count)code_df = jy.run_query(query( jy.SecuMain.InnerCode,jy.SecuMain.SecuCode,jy.SecuMain.ChiName).filter(jy.SecuMain.SecuCode.in_(code)))df = jy.run_query(query( jy.QT_SYWGIndexQuote).filter(jy.QT_SYWGIndexQuote.InnerCode.in_(code_df.InnerCode),jy.QT_SYWGIndexQuote.TradingDay.in_(days),))df2  = pd.merge(code_df, df, on='InnerCode').set_index(['TradingDay','SecuCode'])df2.drop(['InnerCode','ID','UpdateTime','JSID'],axis=1,inplace=True)return df2.to_panel()code = get_industries(name='sw_l2').index[:5]df = get_sw_quote(code,end_date='2018-01-01',count=10)df.to_frame(False).tail()

/opt/conda/envs/python2new/lib/python2.7/site-packages/ipykernel_launcher.py:23: DeprecationWarning: 
Panel is deprecated and will be removed in a future version.
The recommended way to represent these types of 3-dimensional data are with a MultiIndex on a DataFrame, via the Panel.to_frame() method
Alternatively, you can use the xarray package http://xarray.pydata.org/en/stable/.
Pandas provides a `.to_xarray()` method to help automate this conversion.

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

		ChiName	PrevClosePrice	OpenPrice	HighPrice	LowPrice	ClosePrice	TurnoverVolume	TurnoverValue	TurnoverDeals	ChangePCT	RightLevel	IndexPE	IndexPB	TotalMarketValue	AShareTotalMV
TradingDay	SecuCode
2017-12-29	801011	申银万国指数-林业	1800.092	1795.480	1845.60	1784.41	1802.798	2853232.0	3.554701e+07	None	0.0015	4	777.37	3.13	2036129.0	1121708.0
801012	申银万国指数-农产品加工	2551.776	2546.600	2573.80	2515.61	2556.512	93670965.0	1.234257e+09	None	0.0019	4	35.58	3.12	14529320.0	7936090.0
801013	申银万国指数-农业综合	2269.610	2260.530	2278.20	2254.30	2272.480	9084945.0	4.234125e+07	None	0.0013	4	125.71	2.92	867493.0	464852.0
801014	申银万国指数-饲料	3644.520	3646.568	3709.26	3610.05	3685.350	133928157.0	1.452713e+09	None	0.0112	4	23.78	2.71	21027565.0	8124732.0
801015	申银万国指数-渔业	1540.178	1535.748	1566.71	1528.66	1555.760	34618204.0	3.012524e+08	None	0.0101	4	43.95	2.52	5217722.0	2511131.0

def get_zz_quote(code,end_date=None,count=None,start_date=None):'''获取中证指数行情,返回panel结构'''if isinstance(code,str):code=[code]code.sort()code = [x[:6] for x in code]days = get_trade_days(start_date,end_date,count)code_df = jy.run_query(query( jy.SecuMain.InnerCode,jy.SecuMain.SecuCode,jy.SecuMain.ChiName).filter(jy.SecuMain.SecuCode.in_(code)).order_by(jy.SecuMain.SecuCode))df = jy.run_query(query( jy.QT_CSIIndexQuote).filter(jy.QT_CSIIndexQuote.IndexCode.in_(code_df.InnerCode),jy.QT_CSIIndexQuote.TradingDay.in_(days),))df2  = pd.merge(code_df, df, left_on='InnerCode',right_on='IndexCode').set_index(['TradingDay','SecuCode'])df2.drop(['InnerCode','IndexCode','ID','UpdateTime','JSID','OpenInterest','SettleValue','IndexCSIType'],axis=1,inplace=True)return df2.to_panel()panel = get_zz_quote(['000016.XSHG','000001.XSHG'],end_date='2019-01-21',count=10)panel.ClosePrice.tail()

/opt/conda/envs/python2new/lib/python2.7/site-packages/ipykernel_launcher.py:19: DeprecationWarning: 
Panel is deprecated and will be removed in a future version.
The recommended way to represent these types of 3-dimensional data are with a MultiIndex on a DataFrame, via the Panel.to_frame() method
Alternatively, you can use the xarray package http://xarray.pydata.org/en/stable/.
Pandas provides a `.to_xarray()` method to help automate this conversion.

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

SecuCode	000001	000016
TradingDay
2019-01-15	2570.3449	2378.3696
2019-01-16	2570.4221	2381.2178
2019-01-17	2559.6370	2371.3481
2019-01-18	2596.0056	2417.3630
2019-01-21	2610.5094	2432.4870

查询股息率(近12个月)¶

from jqdata import *def DividendRatio(security_list,end_date,count=1):'''查询股息率(日更新)    输入:股票池,截止日期,获取数量    输出:panel结构,单位:1'''trade_days = get_trade_days(end_date=end_date,count = count)security_list.sort()secu_list = [x[:6] for x in security_list]code_df = jy.run_query(query( jy.SecuMain.InnerCode,jy.SecuMain.SecuCode,#     jy.SecuMain.ChiName,jy.SecuMain.CompanyCode).filter(jy.SecuMain.SecuCode.in_(secu_list),jy.SecuMain.SecuCategory==1).order_by(jy.SecuMain.SecuCode))code_df['code'] = security_listdf = jy.run_query(query(#         jy.LC_DIndicesForValuation    #得到整表jy.LC_DIndicesForValuation.InnerCode,jy.LC_DIndicesForValuation.TradingDay, jy.LC_DIndicesForValuation.DividendRatio,).filter(jy.LC_DIndicesForValuation.InnerCode.in_(code_df.InnerCode),jy.LC_DIndicesForValuation.TradingDay.in_(trade_days)))f_df = df.merge(code_df,on='InnerCode').set_index(['TradingDay','code']).drop(['InnerCode','SecuCode'],axis=1)panel = f_df.to_panel()return panelsecuritys =get_index_stocks('399015.XSHE')date='2018-01-01'DividendRatio(securitys,date)

/opt/conda/envs/python2new/lib/python2.7/site-packages/ipykernel_launcher.py:24: DeprecationWarning: 
Panel is deprecated and will be removed in a future version.
The recommended way to represent these types of 3-dimensional data are with a MultiIndex on a DataFrame, via the Panel.to_frame() method
Alternatively, you can use the xarray package http://xarray.pydata.org/en/stable/.
Pandas provides a `.to_xarray()` method to help automate this conversion.

<class 'pandas.core.panel.Panel'>
Dimensions: 1 (items) x 1 (major_axis) x 492 (minor_axis)
Items axis: DividendRatio to DividendRatio
Major_axis axis: 2017-12-29 00:00:00 to 2017-12-29 00:00:00
Minor_axis axis: 000007.XSHE to 300735.XSHE

获取期货合约的基本信息（合约乘数、商品报价的计数单位、最小变动单位）¶

# lru缓存可以不要，加上后查询相同的合约可以避免重复数据库请求from fastcache import clru_cache as lru_cache@lru_cache(maxsize=128)def future_basic_info(future):from jqdata import jyfrom numpy import nanimport reif "9999" in future or "8888" in future:match = re.match(r"(?P<underlying_symbol>[A-Z]{1,})", future)if not match:raise ValueError("未知期货标的：{}".format(future))else:future = get_dominant_future(match.groupdict()["underlying_symbol"])q = query(jy.Fut_ContractMain).filter(jy.Fut_ContractMain.ContractCode == future.split(".")[0])result = jy.run_query(query_object=q).to_dict("record")if result:result = result.pop()min_point = re.match("(?P<value>^[0-9]+([.]{1}[0-9]+){0,1})", result["LittlestChangeUnit"]).groupdict(nan)["value"]return {"ContractUnit": result["CMValue"],   "PriceScale": float(str(min_point)[:-1] + "1") if float(min_point) < 1 else 1,   "MinPoint": float(min_point)}else:return {"ContractUnit": nan,"PriceScale": nan,"MinPoint": nan}future_basic_info('IF1801.CCFX')

{'ContractUnit': 300, 'MinPoint': 0.2, 'PriceScale': 0.1}

多个dataframe的合并 (数据拼接) 获取多份财务报表¶

#获取多年的季度度数据import pandas as pddef get_more_state_fund(q_object,year_list):df_list = []for year in year_list:rets = [get_fundamentals(q, statDate=str(year)+'q'+str(i)) for i in range(1, 5)]df = pd.concat(rets).set_index('statDate')  #个人习惯df_list.append(df)df_ = pd.concat(df_list,keys=year_list,axis=0) #axis=1或axis=0,依据个人习惯return df_q = query(indicator.code,  indicator.statDate,indicator.roe,indicator.inc_return,indicator.pubDate,
  ).filter(  income.code.in_(['000001.XSHE','600507.XSHG']))df = get_more_state_fund(q,['2017','2018'])# df.loc[('2014',slice(None))]df

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

		code	roe	inc_return	pubDate
	statDate
2017	2017-03-31	000001.XSHE	3.0319	3.0309	2017-04-22
2017-03-31	600507.XSHG	10.2024	10.0997	2017-04-22
2017-06-30	000001.XSHE	3.0249	3.0058	2017-08-11
2017-06-30	600507.XSHG	14.0929	14.2133	2017-08-19
2017-09-30	000001.XSHE	3.0724	3.0813	2017-10-21
2017-09-30	600507.XSHG	24.2866	23.9243	2017-10-24
2017-12-31	000001.XSHE	1.8339	1.8320	2018-03-15
2017-12-31	600507.XSHG	22.9191	19.8184	2018-02-09
2018	2018-03-31	000001.XSHE	2.9611	2.9431	2018-04-20
2018-03-31	600507.XSHG	13.6345	13.6067	2018-04-21
2018-06-30	000001.XSHE	3.0000	3.0000	2018-08-16
2018-06-30	600507.XSHG	19.9900	19.7200	2018-08-18
2018-09-30	000001.XSHE	3.0600	3.0300	2018-10-24
2018-09-30	600507.XSHG	21.2400	21.1700	2018-10-25

#h获取多个年度财务数据import pandas as pdstock_list=['000001.XSHE','000527.XSHE','600970.XSHG','600875.XSHG','600880.XSHG']years = range(2005,2018)df_list=[]for year in years:df_year= get_fundamentals(query( indicator.code, indicator.inc_return,indicator.roe   ).filter( indicator.code.in_(stock_list)   ),statDate=year).set_index('code')df_list.append(df_year)name = [str(x) for x in range(2005,2017)]df=pd.concat(df_list,axis=1,keys=name)df

.dataframe thead tr:only-child th { text-align: right; } .dataframe thead th { text-align: left; } .dataframe tbody tr th { vertical-align: top; }

	2005	2006	2007	2008	2009	...	2012	2013	2014	2015	2016
	inc_return	roe	inc_return	roe	inc_return	roe	inc_return	roe	inc_return	roe	...	inc_return	roe	inc_return	roe	inc_return	roe	inc_return	roe	inc_return	roe
000001.XSHE	6.8734	6.3868	21.2849	22.6025	26.7708	27.2057	4.2435	4.1761	26.7942	27.2887	...	16.9316	16.9537	15.4064	15.4724	16.3280	16.2959	14.9783	14.9530	12.4321	12.4283
000527.XSHE	13.3490	13.3300	15.7667	16.3908	26.3179	29.8230	24.7264	22.3681	26.2619	26.6314	...	16.2383	16.5337	NaN	NaN	NaN	NaN	NaN	NaN	NaN	NaN
600875.XSHG	34.1418	34.0314	22.7201	57.1331	16.2666	47.2531	56.7490	6.7020	30.2382	28.6972	...	13.7332	14.8649	13.4276	14.0803	6.2436	6.8654	1.4852	2.0616	-8.9718	-8.0745
600880.XSHG	18.1740	18.5634	21.3198	12.3548	19.8543	20.2199	18.9755	19.9203	20.9028	22.6681	...	13.1731	13.2258	11.4463	11.7852	8.0076	7.9088	0.7765	2.0698	-0.7119	1.6685
600970.XSHG	15.8088	17.5163	14.8404	30.9030	25.1281	28.2336	43.4164	30.6846	48.4384	47.1297	...	16.3465	17.0652	-0.5751	2.0248	2.2335	3.4041	8.9119	12.2016	4.3695	7.6902

5 rows × 24 columns

需要注意的: finance,jy,opt,macro有一次性获取数据不超过3000条的限制(主要是为了提速,get_fundamentals的限制为10000条),

所以当获取的数据超过这个限制时,需要对股票池或者时间段进行分割,多次获取。

全部回复

0/140

本社区仅针对特定人员开放

查看需注册登录并通过风险意识测评

5秒后跳转登录页面...

达人推荐

关注
FX168投资英雄

粉丝:914

帖子数:0

FX168投资英雄

0帖子30关注914粉丝

关注拉黑私信
关注
玉兔呈祥

粉丝:555

帖子数:3

玉兔呈祥:2019年08月23日操盘计划(1)...

玉兔呈祥:2019年07月26日操盘计划(1)...

玉兔呈祥

3帖子0关注555粉丝

关注拉黑私信
关注
临沂石老师

粉丝:565

帖子数:0

临沂石老师

0帖子12关注565粉丝

关注拉黑私信

量化课程

移动端课程

量化交易吧 / 源码分享 帖子：3383531 新帖：0

Query对象的简单使用教程

耶伦发表于：5 月 9 日 17：49回复(1)

涉及到使用数据库操作的数据有:

基本的查询方式

基本的查询方式¶

get_fundamentals 获取某个/多个季度/年度的数据(指定参数statDate)¶

in_ 判断某个字段的值是否在列表之中(一般用于查询多个标的)¶

distinct 去重,用于查看数据库中某个字段都存在哪些值¶

~ 反向查询符号¶

运算和命名(label)¶

or_ 或查询¶

contains/like/ilike 数据库中的字符串模糊匹配¶

简化计算的方法(sqlalchemy.sql.func) 和 group_by¶

其他常用数据获取方式整理¶

获取旧的申万指数列表和新的申万指数列表(14年有过改动)¶

查询申万行情( 包含行业pe/pb)¶

查询股息率(近12个月)¶

获取期货合约的基本信息（合约乘数、商品报价的计数单位、最小变动单位）¶

多个dataframe的合并 (数据拼接) 获取多份财务报表¶

需要注意的: finance,jy,opt,macro有一次性获取数据不超过3000条的限制(主要是为了提速,get_fundamentals的限制为10000条),

所以当获取的数据超过这个限制时,需要对股票池或者时间段进行分割,多次获取。

全部回复

0/140

粉丝:914

帖子数:0

粉丝:555

帖子数:3

粉丝:565

帖子数:0

量化课程

热门标签

删除回复

确认要删除这篇文章么？

举报用户

信息提示

该文章已删除

设置置顶

完成设置【置顶】！

设置置顶

已取消设置【置顶】！

设置精华

完成设置【精华】！

设置精华

已取消设置【精华】！

审核信息

该文章已审核通过

审核信息

您已设置该文章审核不通过

举报成功

您已举报成功

用户登录

移动帖子

创建私信

屏蔽提示

确认要屏蔽该用户么？

屏蔽回复

您已对该用户实现屏蔽

信息回复

已发送成功

量化交易吧 / 源码分享帖子：3383531 新帖：0