webread CSV is missing rows that show in the browser

Ryan Klingert on 30 Jan 2020
Answered: Guillaume on 7 Feb 2020
Intro:
I want to start by saying I am pretty new to web scraping. While I have had some success working with HTML and the string editing functions, I haven't been able to figure out how to download a table.
Background:
The overall background is that I am working on a project to build a roster-picking model for daily fantasy sports. Several websites, including the one I am using, publish relatively accurate projections of each player's daily points. To backtest my model I need to collect projections from past seasons, so I am trying to scrape this site.
Question:
This site displays a table of historical results. It also has a link to download these results as a CSV: https://rotogrinders.com/projected-stats/nhl-skater.csv?site=draftkings&date=2019-12-12
The issue is that when you visit that link in a web browser you get a CSV with hundreds of rows, matching the HTML page. However, when you use webread to systematically download and save the CSV, you only get a select few of those rows. Code is posted below.
Any help would be great!
options = weboptions('Timeout',15);
d = datetime(2019,12,12);   % named d so the built-in date function is not shadowed
d.Format = 'yyyy-MM-dd';    % zero-padded year-month-day, e.g. 2019-12-12
% Build the request URL; string(d) uses the Format set above
url = "https://rotogrinders.com/projected-stats/nhl-skater.csv?site=draftkings&date=" + string(d);
data = webread(url,options);   % CSV content is normally returned as a table
1 Comment
Rik on 30 Jan 2020
I suspect the title might have triggered the spam filter. A word of advice: remove all non-Matlab relevant content. The point of your question is that webread doesn't download the same csv as you see in your browser, so that is the only relevant part for the question title.


Answers (1)

Guillaume on 7 Feb 2020
If I try to download the file from your link using a web browser, I only get a few rows. Considering that the main webpage shows a prominent banner telling you that you can only see rosters as a premium user, the problem seems clear: you need to be logged in to download the full file.
Modifying your weboptions to specify username/password should work (assuming the website is designed properly):
options = weboptions('Timeout',15, 'Username', '??', 'Password', '***');
%rest of code as is...
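
For collecting a whole range of past dates, something along these lines might work. It is only a sketch under assumptions: buildUrls, doDownload, the three-day date range and the output file names are placeholders I made up, not anything from the site. Note also that the 'Username' and 'Password' options in weboptions only cover HTTP basic/digest authentication, so this would not help if the site logs you in through a form and a session cookie.

```matlab
% Sketch: download one CSV per date, passing credentials through weboptions.
% ASSUMPTIONS: the site accepts HTTP basic/digest authentication; the date
% range, credentials and file names below are placeholders.

baseUrl = "https://rotogrinders.com/projected-stats/nhl-skater.csv";
dates   = datetime(2019,12,12) + caldays(0:2);   % hypothetical 3-day range
urls    = buildUrls(baseUrl, "draftkings", dates);
disp(urls(:));                                   % inspect the URLs first

doDownload = false;   % set to true once real credentials are filled in
if doDownload
    options = weboptions('Timeout',15, 'Username','??', 'Password','***');
    dates.Format = 'yyyy-MM-dd';
    for k = 1:numel(urls)
        outFile = "nhl-skater_" + string(dates(k)) + ".csv";
        websave(outFile, urls(k), options);      % save the raw CSV to disk
        T = readtable(outFile);                  % read it back to count rows
        fprintf('%s: %d rows\n', outFile, height(T));
    end
end

function urls = buildUrls(baseUrl, site, dates)
% Build one query URL per date; 'yyyy-MM-dd' zero-pads month and day.
    dates.Format = 'yyyy-MM-dd';
    urls = baseUrl + "?site=" + site + "&date=" + string(dates);
end
```

Printing the row count per file makes it easy to see whether the login actually took effect: if you still get only a handful of rows, the credentials are not being accepted this way.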
