Shared Functions | External Data Retrieval

Author: jedfsjfjsdf, 2019-05-09 20:27. Source: FX168财经网 People Channel

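1. Overview

This post collects data-retrieval methods shared by community members. It is for learning and discussion only, and no commercial interest is involved.
The series will keep being updated under the title 【共享函数】 (Shared Functions).


2. Functions included:

Scraping Sina hot stocks (by 股票疯赢)
Scraping limit-up reasons from Xuangubao (选股宝) (by 包希仁)
Fetching Hong Kong IPO data to compute IPO subscription returns (by 止一之路)
Scraping industry quote/valuation data from the official Shenwan (申万) site (by ssk)
Scraping government bond yield data (by tinysnowing)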

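Scraping Sina hot stocks

Author: 股票疯赢

import json

import pandas as pd
import requests

# normalize_code is not defined in this post; it is presumably the helper
# available in the JoinQuant research environment.


def get_hot_stock_from_sina():
    '''Fetch the hot-stock list from Sina.'''
    url = ('https://ssl-data.sina.com.cn/api/openapi.php/'
           'WeiboReferService.getListSymbol?code=CNHOUR6&callback=var%20AHM=')
    html = requests.get(url).content.decode()
    # The response is JSONP: keep only the JSON between the outer parentheses.
    n = html[html.index('(') + 1:html.rindex(')')]
    h = json.loads(n)
    data = pd.DataFrame(h['result']['data'])
    data.SYMBOL = data.SYMBOL.apply(normalize_code)
    return data


get_hot_stock_from_sina().head()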

Output:

   NAME      REF     SYMBOL
0  东方通信  891768  600776.XSHG
1  银之杰    735779  300085.XSHE
2  东方财富  654869  300059.XSHE
3  网宿科技  592289  300017.XSHE
4  安控科技  498015  300370.XSHE
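
The `callback=var%20AHM=` parameter in the URL means the endpoint answers with JSONP rather than plain JSON, so the payload has to be cut out of its wrapper before parsing. A minimal sketch of that step follows. The helper name `unwrap_jsonp` and the sample string are made up for illustration, and the exact wrapper the server returns is an assumption.

```python
import json


def unwrap_jsonp(text):
    """Parse the JSON found between the first '(' and the last ')'.

    Searching for the closing parenthesis from the right keeps the slice
    intact even when a string inside the payload contains ')'.
    """
    start = text.index('(') + 1
    end = text.rindex(')')
    return json.loads(text[start:end])


# Hypothetical sample shaped like a "var AHM=(...)" response; not real data.
sample = 'var AHM=({"result": {"data": [{"NAME": "demo (A)", "REF": "1"}]}});'
payload = unwrap_jsonp(sample)
rows = payload["result"]["data"]
print(rows[0]["NAME"])
```

Keeping the unwrapping in its own function also makes it possible to check the parsing offline, without calling the remote API.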

Scraping limit-up reasons from Xuangubao (选股宝)

Author: 包希仁

import datetime
import json
import urllib.request

import pandas as pd

# normalize_code is not defined in this post; it is presumably the helper
# available in the JoinQuant research environment.


def Xuangubao():
    url = 'https://flash-api.xuangubao.cn/api/pool/detail?pool_name=limit_up'  # limit-up pool
    # url = 'https://flash-api.xuangubao.cn/api/pool/detail?pool_name=limit_up_broken'  # broken limit-up pool
    header_dict = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko'}
    # Python 2 equivalent:
    # req = urllib2.Request(url=url, headers=header_dict)
    # df = pd.DataFrame(json.loads(urllib2.urlopen(req).read())['data'])
    req = urllib.request.Request(url, headers=header_dict)
    df = pd.DataFrame(json.loads(urllib.request.urlopen(req).read())['data'])
    df['stock_reason'] = df.surge_reason.apply(lambda x: x['stock_reason'])
    df['plate_name'] = df.surge_reason.apply(lambda x: x['related_plates'][0]['plate_name'])

    def get_plate_reason(x):
        # Some records carry no plate reason; return None for those.
        try:
            return x['related_plates'][0]['plate_reason']
        except (KeyError, IndexError, TypeError):
            return None

    df['plate_reason'] = df.surge_reason.apply(get_plate_reason)
    # Use the first timestamp of the limit timeline as the limit-up time.
    df['limit_timeline'] = df.limit_timeline.apply(
        lambda x: datetime.datetime.fromtimestamp(x['items'][0]['timestamp']))
    df.index = df.surge_reason.apply(lambda x: normalize_code(x['symbol']))
    df.index.name = None
    return df.drop('surge_reason', axis=1)


Xuangubao().head()

Output (shown transposed, one column per stock):

field                                   002450.XSHE          300538.XSHE          002207.XSHE          002552.XSHE          000727.XSHE
break_limit_down_times                  0                    0                    0                    0                    0
break_limit_up_times                    0                    0                    8                    4                    2
buy_lock_volume_ratio                   0.009174             0.187240             0.000931             0.001281             0.007109
change_percent                          0.050382             0.100110             0.050325             0.050649             0.099585
first_break_limit_down                  0                    0                    0                    0                    0
first_break_limit_up                    0                    0                    1551679479           1551681207           1551663813
first_limit_down                        0                    0                    0                    0                    0
first_limit_up                          1551662703           1551662703           1551679467           1551681189           1551663003
is_new_stock                            False                False                False                False                False
issue_price                             14.20                15.85                7.85                 20.00                6.16
last_break_limit_down                   0                    0                    0                    0                    0
last_break_limit_up                     0                    0                    1551681900           1551681444           1551666750
last_limit_down                         0                    0                    0                    0                    0
last_limit_up                           1551662703           1551662703           1551681981           1551682110           1551667500
limit_down_days                         0                    0                    0                    0                    0
limit_timeline                          2019-03-04 09:25:03  2019-03-04 09:25:03  2019-03-04 14:04:27  2019-03-04 14:33:09  2019-03-04 09:30:03
limit_up_days                           5                    3                    1                    1                    1
listed_date                             1279209600           1472140800           1201449600           1298563200           864057600
m_days_n_boards_boards                  9                    3                    3                    0                    4
m_days_n_boards_days                    15                   3                    6                    0                    8
mtm                                     0.0                  0.0                  0.0                  0.0                  0.0
nearly_new_acc_*                        0.0                  0.0                  0.0                  0.0                  0.0
nearly_new_break_days                   0                    0                    0                    0                    0
new_stock_acc_*                         -0.515493            0.899685             -0.175796            -0.595500            -0.569805
new_stock_break_limit_up                0                    0                    0                    0                    0
new_stock_limit_up_days                 0                    0                    0                    0                    0
new_stock_limit_up_price_before_broken  0.0                  0.0                  0.0                  0.0                  0.0
non_restricted_capital                  2.224831e+10         7.225360e+08         1.536120e+09         1.619205e+09         7.766236e+09
price                                   6.88                 30.11                6.47                 8.09                 2.65
sell_lock_volume_ratio                  0                    0                    0                    0                    0
stock_chi_name                          ST康得新             同益股份             ST准油               *ST宝鼎              华东科技
symbol                                  002450.SZ            300538.SZ            002207.SZ            002552.SZ            000727.SZ
total_capital                           2.436139e+10         2.538039e+09         1.547478e+09         2.477420e+09         1.200335e+10
turnover_ratio                          0.000768             0.046596             0.044471             0.030213             0.125476
volume_bias_ratio                       0.015875             0.630724             1.641923             1.326435             1.183715
yesterday_break_limit_up_times          0                    0                    3                    0                    0
yesterday_first_limit_up                1551403503           1551403503           1551405519           0                    0
yesterday_last_limit_up                 1551403503           1551403503           1551406431           0                    0
yesterday_limit_down_days               0                    0                    0                    0                    0
yesterday_limit_up_days                 4                    2                    0                    0                    0

Text fields per stock (stock_reason / plate_name / plate_reason):

002450.XSHE: 2018年度实现净利润4.02亿元 / ST股 / 年报披露高峰期,扭亏个股有望摘帽
300538.XSHE: 18年年报10转8 / 高送转 / None
002207.XSHE: 主营石油技术服务、建筑*、运输服务和化工产品销售,属于上游石油天然气采掘服务业 / ST股 / 年报披露高峰期,扭亏个股有望摘帽
002552.XSHE: 三季报扭亏,主营大型铸锻件 / ST股 / 年报披露高峰期,扭亏个股有望摘帽
000727.XSHE: 广东聚华印刷显示技术有限公司为参股公司,目前聚华公司建成了“国家印刷及柔性显示创新中心”,开... / 柔性屏 / 华为发布MATE X折叠屏手机
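
Both functions above call `normalize_code`, which the post never defines; it presumably comes from the JoinQuant research environment. To run the code elsewhere, a rough stand-in could simply map exchange suffixes, following the output above, where symbol `002450.SZ` appears under the index `002450.XSHE`. The function name, the `.SH`/`.SS` mappings, and the restriction to suffixed codes are all assumptions of this sketch, not JoinQuant's actual behavior.

```python
# Hypothetical stand-in for normalize_code; not JoinQuant's implementation.
SUFFIX_MAP = {"SZ": "XSHE", "SH": "XSHG", "SS": "XSHG"}


def normalize_code_standin(symbol):
    """Convert codes such as '002450.SZ' to '002450.XSHE'."""
    code, sep, suffix = symbol.partition(".")
    if not sep:
        raise ValueError("expected a suffixed code, got %r" % symbol)
    suffix = suffix.upper()
    if suffix in ("XSHE", "XSHG"):
        # Already in the target form.
        return code + "." + suffix
    if suffix not in SUFFIX_MAP:
        raise ValueError("unknown exchange suffix in %r" % symbol)
    return code + "." + SUFFIX_MAP[suffix]


print(normalize_code_standin("002450.SZ"))
```

Raising on unknown input, rather than guessing an exchange from the digits, keeps a mislabeled code from silently ending up in the index.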

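Fetching Hong Kong IPO data to compute IPO subscription returns

Author: 止一之路

The content is lengthy; please follow the link to the original post.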

Scraping industry quote/valuation data from the official Shenwan (申万) site

Author: ssk

# Fetch Shenwan industry index data from the official Shenwan site
import json
from datetime import date

import pandas as pd
import requests


# code: industry code (str) or list of codes, see
#       https://www.joinquant.com/help/api/help?name=plateData#申万行业
# frequency: day/week/month
# start_date: None means the earliest available date
# end_date: None means today
# fields: None means all fields
def get_sw_data(code=None, start_date=None, end_date=None, frequency='day', fields=None):
    # Request headers
    header = {
        'HOST': 'www.swsindex.com',
        'Referer': 'http://www.swsindex.com/idx0200.aspx?columnid=8838&type=Day',
        'User-Agent': ('Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) '
                       'Chrome/53.0.2785.104 Safari/537.36 Core/1.53.4482.400 QQBrowser/9.7.13001.400'),
    }
    # Column names of the result table
    sw_columns_list = ['SwIndexCode', 'SwIndexName', 'BargainDate', 'OpenIndex', 'CloseIndex',
                       'MaxIndex', 'MinIndex', 'BargainAmount', 'BargainSum', 'Markup',
                       'TurnoverRate', 'PE', 'PB', 'MeanPrice', 'BargainSumRate',
                       'NegotiablesShareSum', 'NegotiablesShareSum2', 'DP']
    # Request parameters
    param = {
        'tablename': 'V_Report',
        'key': 'id',
        # Page number; each page returns 20 rows
        'p': '1',
        # Query string: codes, date range and data type (overwritten below)
        'where': "swindexcode in ('801020') and   BargainDate>='2018-04-02' and  BargainDate<='2018-04-24' and type='Day'",
        # Sort order: "swindexcode asc" sorts by code ascending;
        # BargainDate_1 sorts by date descending, BargainDate_2 ascending
        'orderby': 'swindexcode asc,BargainDate_2',
        # Fields to return (overwritten below)
        'fieldlist': ','.join(sw_columns_list),
        'pagecount': '1',
        'timed': '1524497094532',
    }
    # Supported data types (daily, weekly, monthly)
    frequency_list = ['day', 'week', 'month']

    # Build the code part of the query; fall back to '801010' when no code is given
    if code is None:
        code = '801010'
    if isinstance(code, list):
        code_str = str(code).replace('[', '').replace(']', '')
    else:
        code_str = "'" + code + "'"
    where = 'swindexcode in (' + code_str

    # Date range; out-of-range values fall back to the defaults
    today_str = date.today().strftime('%Y-%m-%d')
    if (start_date is None) or (start_date < '1999-12-30') or (start_date > today_str):
        start_date = '1999-12-30'
    if (end_date is None) or (end_date > today_str) or (end_date < '1999-12-30'):
        end_date = today_str
    where += ") and BargainDate>='" + start_date + "' and BargainDate<='" + end_date

    # Data type
    if frequency not in frequency_list:
        frequency = 'day'
    where += "' and type='" + frequency + "'"
    param['where'] = where

    # Fields; the three key columns are always included
    columns = sw_columns_list
    if fields is not None and set(fields).issubset(set(sw_columns_list)):
        required = ['SwIndexCode', 'SwIndexName', 'BargainDate']
        columns = required + [f for f in fields if f not in required]
    param['fieldlist'] = ','.join(columns)

    url = 'http://www.swsindex.com/handler.aspx'
    frames = []
    # Page counter
    page = 1
    while True:
        # Fetch one page
        ret = requests.get(url, data=param, headers=header)
        if not ret.ok:
            break
        # Normalize quotes and the date format
        data = ret.text.replace("'", '"').replace(' 0:00:00', '').replace('/', '-')
        # Parse the page; stop when it is empty
        data = json.loads(data).get('root')
        if not data:
            break
        frames.append(pd.DataFrame(data, columns=columns))
        # Move on to the next page
        page += 1
        param['p'] = str(page)

    if not frames:
        return pd.DataFrame()
    df = pd.concat(frames)
    df.BargainDate = pd.to_datetime(df.BargainDate, format='%Y-%m-%d')
    return df


df = get_sw_data('850111', start_date='2019-02-23')
df

Output:

    SwIndexCode  SwIndexName  BargainDate  OpenIndex  CloseIndex  MaxIndex  MinIndex  BargainAmount  BargainSum  Markup  TurnoverRate     PE    PB  MeanPrice  BargainSumRate  NegotiablesShareSum  NegotiablesShareSum2    DP
 0       850111     种子生产   2019-02-25    2493.65     2603.85   2612.97   2469.67          18544      109029    4.57        3.6043  45.49  2.68       6.70            0.10           3496540.23             437067.53  0.57
 1       850111     种子生产   2019-02-26    2601.41     2577.02   2643.98   2534.83          20089      115323   -1.03        3.9047  45.02  2.65       6.65            0.11           3470405.45             433800.68  0.58
 2       850111     种子生产   2019-02-27    2571.52     2547.74   2603.46   2530.19          13651       78331   -1.14        2.6533  44.51  2.62       6.58            0.09           3430769.77             428846.22  0.59
 3       850111     种子生产   2019-02-28    2550.00     2559.18   2584.13   2523.73           8255       50326    0.45        1.6044  44.71  2.64       6.62            0.08           3449990.37             431248.80  0.58
 4       850111     种子生产   2019-03-01    2567.25     2570.26   2590.56   2519.63           9037       53291    0.43        1.7564  44.91  2.65       6.64            0.08           3462555.34             432819.42  0.58
 5       850111     种子生产   2019-03-04    2581.01     2590.01   2636.93   2559.27          14772       91984    0.77        2.8711  45.25  2.67       6.70            0.09           3489184.93             436148.12  0.58
 6       850111     种子生产   2019-03-05    2588.81     2651.59   2662.00   2560.38          16510       94081    2.38        3.2090  46.33  2.73       6.89            0.11           3577580.69             447197.59  0.56
 7       850111     种子生产   2019-03-06    2669.46     2688.30   2715.86   2620.91          19228      115049    1.38        3.7373  46.97  2.77       6.98            0.10           3627549.76             453443.72  0.56
 8       850111     种子生产   2019-03-07    2691.75     2723.26   2791.49   2645.81          18851      115497    1.30        3.6640  47.58  2.81       7.12            0.10           3680740.95             460092.62  0.55
 9       850111     种子生产   2019-03-08    2672.42     2600.62   2751.95   2574.58          18283      105621   -4.50        3.5536  45.44  2.68       6.73            0.09           3506135.99             438267.00  0.58
10       850111     种子生产   2019-03-11    2602.07     2717.11   2721.98   2593.20          15197       91143    4.48        2.9538  47.47  2.80       7.14            0.10           3682245.16             460280.65  0.55
11       850111     种子生产   2019-03-12    2738.21     2723.11   2783.13   2673.28          17973      115338    0.22        3.4933  47.58  2.81       7.15            0.10           3686361.67             460795.21  0.55
12       850111     种子生产   2019-03-13    2755.56     2686.94   2826.68   2650.85          20313      126599   -1.33        3.9482  46.95  2.77       7.05            0.12           3631513.24             453939.15  0.56
13       850111     种子生产   2019-03-14    2658.12     2567.43   2679.73   2527.97          13592       78529   -4.45        2.6419  44.68  2.65       6.72            0.10           3469352.20             433669.02  0.59
14       850111     种子生产   2019-03-15    2578.16     2594.87   2640.55   2554.47           9537       61640    1.07        1.8536  45.16  2.67       6.83            0.08           3508430.54             438553.82  0.58
15       850111     种子生产   2019-03-18    2617.85     2751.26   2759.71   2587.89          12125       98633    6.03        2.3567  47.88  2.83       7.10            0.12           3701939.95             462742.49  0.55
16       850111     种子生产   2019-03-19    2776.05     2799.75   2830.07   2727.96          11911      108091    1.76        2.3151  48.72  2.88       7.17            0.14           3752415.20             469051.90  0.54
17       850111     种子生产   2019-03-20    2792.86     2796.13   2858.79   2746.40          12165       90645   -0.13        2.3645  48.66  2.88       7.17            0.12           3751792.49             468974.06  0.54
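
The request to `handler.aspx` is driven by a SQL-like `where` string assembled from the industry codes, the date range and the frequency. Pulling that step into a small pure function makes the clause easy to inspect without touching the network. The following is only a sketch: the helper name `build_where` and its `today` parameter are hypothetical, while the clause layout and the `1999-12-30` lower bound follow the function above.

```python
from datetime import date

EARLIEST = "1999-12-30"  # lower bound used by get_sw_data
FREQUENCIES = ("day", "week", "month")


def build_where(codes, start_date=None, end_date=None, frequency="day", today=None):
    """Assemble the where clause; out-of-range inputs fall back to defaults."""
    today = today or date.today().isoformat()
    if isinstance(codes, str):
        codes = [codes]
    code_str = ",".join("'{}'".format(c) for c in codes)
    # ISO-formatted dates (YYYY-MM-DD) order correctly as plain strings.
    if start_date is None or not (EARLIEST <= start_date <= today):
        start_date = EARLIEST
    if end_date is None or not (EARLIEST <= end_date <= today):
        end_date = today
    if frequency not in FREQUENCIES:
        frequency = "day"
    return ("swindexcode in ({}) and BargainDate>='{}' "
            "and BargainDate<='{}' and type='{}'").format(
                code_str, start_date, end_date, frequency)


print(build_where("850111", start_date="2019-02-23", today="2019-03-20"))
```

Passing `today` explicitly is a testing convenience so the result does not depend on the day the code is run.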

Scraping government bond yield data

Author: tinysnowing

import json
import time

import pandas as pd
import requests


def get_bnd_yield(year=10):
    # pair ids for the 10-, 5- and 1-year China government bond yields
    ids = {10: '29227', 5: '29234', 1: '29231'}
    url = ('https://cn.investing.com/common/modules/js_instrument_chart/api/data.php?'
           + 'pair_id={}&pair_id_for_news={}'.format(ids[year], ids[year])
           + '&chart_type=area&pair_interval=month&candle_count=120'
           + '&events=yes&volume_series=yes&period=5-years')
    headers = {
        'X-Requested-With': 'XMLHttpRequest',
        'Host': 'cn.investing.com',
        'Referer': 'https://cn.investing.com/rates-bonds/china-{}-year-bond-yield'.format(year),
        'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64)',
    }
    res = requests.get(url, headers=headers)
    res = json.loads(res.content.decode('utf-8').replace("'", "\""))
    data = pd.DataFrame(res['candles'])
    # Keep the first two columns: timestamp and yield
    data = data.iloc[:, :2]
    data.columns = ['date', 'y' + str(year)]
    # Use the first 10 digits of the timestamp as epoch seconds
    data['date'] = data['date'].map(
        lambda x: time.strftime("%Y-%m-%d", time.localtime(int(str(x)[:10]))))
    data.set_index('date', inplace=True)
    return data


def get_bnd_yields(years=(1, 5, 10)):
    # One column per maturity, aligned on the date index
    frames = [get_bnd_yield(year=yr) for yr in years]
    return pd.concat(frames, axis=1)


df = get_bnd_yields()
df

Output:

date        y1     y5     y10
2014-04-01  3.650  4.160  4.330
2014-05-01  3.360  4.010  4.160
2014-06-01  3.370  3.860  4.060
2014-07-01  3.763  4.031  4.298
2014-08-01  3.799  3.998  4.248
2014-09-01  3.767  3.931  4.028
2014-10-01  3.406  3.565  3.786
2014-11-01  3.070  3.413  3.546
2014-12-01  3.263  3.538  3.648
2015-01-01  3.151  3.409  3.514
2015-02-01  3.077  3.257  3.379
2015-03-01  3.190  3.479  3.623
2015-04-01  2.869  3.288  3.422
2015-05-01  1.960  3.253  3.591
2015-06-01  1.767  3.247  3.629
2015-07-01  2.246  3.178  3.474
2015-08-01  2.307  3.165  3.394
2015-09-01  2.360  3.087  3.276
2015-10-01  2.379  2.903  3.087
2015-11-01  2.599  2.926  3.088
2015-12-01  2.329  2.713  2.862
2016-01-01  2.393  2.789  2.909
2016-02-01  2.279  2.655  2.909
2016-03-01  2.163  2.531  2.886
2016-04-01  2.218  2.769  2.946
2016-05-01  2.338  2.767  2.995
2016-06-01  2.390  2.700  2.875
2016-07-01  2.240  2.606  2.805
2016-08-01  2.150  2.594  2.805
2016-09-01  2.185  2.565  2.769
2016-10-01  2.190  2.480  2.744
2016-11-01  2.300  2.765  2.943
2016-12-01  2.751  2.883  3.066
2017-01-01  2.770  3.037  3.363
2017-02-01  2.783  3.000  3.358
2017-03-01  2.885  3.085  3.310
2017-04-01  3.160  3.347  3.477
2017-05-01  3.475  3.653  3.670
2017-06-01  3.453  3.502  3.578
2017-07-01  3.428  3.574  3.629
2017-08-01  3.428  3.635  3.675
2017-09-01  3.460  3.630  3.638
2017-10-01  3.583  3.963  3.916
2017-11-01  3.700  3.876  3.917
2017-12-01  3.803  3.860  3.915
2018-01-01  3.583  3.845  3.944
2018-02-01  3.313  3.759  3.857
2018-03-01  3.350  3.690  3.778
2018-04-01  3.007  3.175  3.653
2018-05-01  3.185  3.452  3.646
2018-06-01  3.210  3.410  3.543
2018-07-01  2.893  3.227  3.533
2018-08-01  2.836  3.386  3.600
2018-09-01  2.990  3.470  3.655
2018-10-01  2.837  3.364  3.533
2018-11-01  2.645  3.168  3.398
2018-12-01  2.575  3.014  3.270
2019-01-01  2.415  2.923  3.130
2019-02-01  2.409  3.033  3.208
2019-03-01  2.445  3.040  3.148
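
The date conversion in `get_bnd_yield` keeps only the first ten digits of each timestamp, which suggests the API returns epoch milliseconds (this is inferred from the slicing; the post does not say so), and `time.localtime` makes the resulting dates depend on the time zone of the machine running the code. A sketch of a variant with an explicit offset follows. The helper name `ms_to_date` is hypothetical, and UTC+8 is an assumed time zone for the data.

```python
from datetime import datetime, timedelta, timezone

# Assumed time zone for the data: China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8))


def ms_to_date(ms, tz=CST):
    """Format an epoch-milliseconds timestamp as YYYY-MM-DD in the given zone."""
    return datetime.fromtimestamp(int(ms) / 1000, tz=tz).strftime("%Y-%m-%d")


# 1551369600000 ms is 2019-02-28 16:00:00 UTC, i.e. midnight 2019-03-01 in UTC+8.
print(ms_to_date(1551369600000))
print(ms_to_date(1551369600000, tz=timezone.utc))
```

The same instant lands on different calendar days in the two zones, which is why monthly candles can appear shifted by a day when the code runs on a server set to UTC.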
 