
Row Limit Python data retrieval
Moderators: adafruit_support_bill, adafruit

Forum rules
If you're posting code, please make sure your code does not include your Adafruit IO Active Key or WiFi network credentials.
Please be positive and constructive with your questions and comments.

Row Limit Python data retrieval

by bayesp on Mon Apr 20, 2020 1:50 pm

Hi,

Is there a row limit when fetching data using the Adafruit IO Python library? If I pull a feed back into an array and then load the array into a DataFrame, I only get 1000 rows.

Thanks
bayesp
 
Posts: 6
Joined: Tue Jan 15, 2013 8:22 am

Re: Row Limit Python data retrieval

by brubell on Tue Apr 21, 2020 9:40 am

Could you please post your Python code along with an example of the output? Thanks

brubell
 
Posts: 1040
Joined: Fri Jul 17, 2015 10:33 pm

Re: Row Limit Python data retrieval

by bayesp on Tue Apr 21, 2020 11:50 am

Hi, here is the code with my credentials redacted:
Code:
!pip install adafruit-io

import altair as alt
import pandas as pd
from Adafruit_IO import Client

# Credentials redacted
aio = Client('xxxxxxxx', 'xxxxxxxx')

# Pull the feed into a list, then into a DataFrame
data = aio.data('enviro')
df = pd.DataFrame(data)

df.sort_values(by=['created_epoch'], inplace=True, ascending=False)
df['created_at'] = df['created_at'].astype('datetime64[ns]')

source = df

alt.Chart(source).mark_line().encode(
    x='created_at:T',
    y='value:Q'
)

df.head()


Output of df.head():
created_epoch created_at updated_at value completed_at feed_id expiration position id lat lon ele
0 1587483858 2020-04-21 15:44:18 None 21.84 None 1356732 2020-05-21T15:44:18Z None 0EE0K1TPK75WG22VC91PCY8MS3 None None None
1 1587483847 2020-04-21 15:44:07 None 21.84 None 1356732 2020-05-21T15:44:07Z None 0EE0K1QAEBPZT8Z5CMP53ST4YD None None None
2 1587483836 2020-04-21 15:43:56 None 21.84 None 1356732 2020-05-21T15:43:56Z None 0EE0K1KZ1PW8TQKKEKBQBX8BDD None None None
3 1587483825 2020-04-21 15:43:45 None 21.84 None 1356732 2020-05-21T15:43:45Z None 0EE0K1GK62VQ788G03R5NVGRQ1 None None None
4 1587483814 2020-04-21 15:43:34 None 21.84 None 1356732 2020-05-21T15:43:34Z None 0EE0K1D7V1EEG7KBQ3MEY9F8B6 None None None

Output of df.describe():

created_epoch feed_id
count 1.000000e+03 1000.0
mean 1.587478e+09 1356732.0
std 3.183911e+03 0.0
min 1.587473e+09 1356732.0
25% 1.587476e+09 1356732.0
50% 1.587478e+09 1356732.0
75% 1.587481e+09 1356732.0
max 1.587484e+09 1356732.0
bayesp
 
Posts: 6
Joined: Tue Jan 15, 2013 8:22 am

Re: Row Limit Python data retrieval

by brubell on Wed Apr 29, 2020 12:33 pm

I'm not sure why you're hitting a row limit. You could try downloading the feed as a CSV or JSON and parsing through it:

downloaddata.png
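If you go the CSV route, the export can be loaded straight back into pandas. A sketch (the column names below are assumed to match the feed data shown earlier in the thread, so check your export's actual header; a real file would be read with pd.read_csv("enviro.csv")):

```python
import io

import pandas as pd

# Stand-in for the downloaded export, with columns assumed from the
# feed data shown earlier in this thread.
csv_text = """id,value,created_at
0EE0K1TPK75WG22VC91PCY8MS3,21.84,2020-04-21 15:44:18
0EE0K1QAEBPZT8Z5CMP53ST4YD,21.84,2020-04-21 15:44:07
"""

# parse_dates converts created_at to datetime64 in one step
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["created_at"])
df = df.sort_values(by="created_at", ascending=False)
```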

brubell
 
Posts: 1040
Joined: Fri Jul 17, 2015 10:33 pm

Re: Row Limit Python data retrieval

by MoogMan1073 on Sat Jul 04, 2020 8:16 pm

Hi, I've had the exact same problem: only 1000 rows of data can be retrieved at a given time using the Client.data('feed') python function.

1000 seems like a very unlikely number to reach by chance. It does appear to be a limit defined somewhere. If this limit is documented anywhere in the python docs for Adafruit IO, it is not obvious.

I understand I could simply parse through the downloaded data in a CSV, but I would much prefer to write a single Python application that conducts automated analytics and reports without any human intervention. Is there any way around this limit? If not, I suppose the next best thing would be downloading 1000 data points periodically, storing the data locally, and then creating reports automatically once the desired time interval has been reached.

However, I feel like I should be able to simply pull more than 1000 data points at a given time and that be the end of it. Is there any way to disable this 1000 data point limit or modify it somehow?
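One possible workaround, assuming the start_time/end_time filtering described in the IO API's pagination docs: page backwards through the feed in 1000-row chunks, using the oldest created_at of each page as the end_time of the next request. The paging loop can be kept separate from the HTTP call; everything here is a sketch, not a documented client feature:

```python
def fetch_all(get_page, limit=1000):
    """Collect a feed's full history by paging backwards.

    get_page(end_time) must return up to `limit` rows, newest first,
    each a dict with at least "id" and "created_at" keys; end_time=None
    means "start from the newest row".
    """
    rows, seen, end_time = [], set(), None
    while True:
        page = get_page(end_time)
        # end_time filtering may be inclusive, so drop any boundary row
        # already collected on the previous page.
        fresh = [r for r in page if r["id"] not in seen]
        rows.extend(fresh)
        seen.update(r["id"] for r in fresh)
        if len(page) < limit or not fresh:
            break  # short page: we've reached the oldest data
        end_time = page[-1]["created_at"]
    return rows
```

To wire this to Adafruit IO, get_page would issue a GET against https://io.adafruit.com/api/v2/{username}/feeds/{feed_key}/data with an X-AIO-Key header and limit/end_time query parameters; see the API pagination docs for the exact parameter behavior.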

MoogMan1073
 
Posts: 1
Joined: Sun Jan 10, 2016 8:32 pm

Re: Row Limit Python data retrieval

by brubell on Mon Jul 06, 2020 9:30 am

MoogMan1073 wrote:Hi, I've had the exact same problem: only 1000 rows of data can be retrieved at a given time using the Client.data('feed') python function.

1000 seems like a very unlikely number to reach by chance. It does appear to be a limit defined somewhere. If this limit is documented anywhere in the python docs for Adafruit IO, it is not obvious.

I understand I could simply parse through downloaded data in a csv but I would highly prefer to write a single python application to conduct automated analytics and reports without any human intervention. Is there any way around this limit? If not, I suppose the next best thing would be downloading 1000 data points periodically, storing the data locally and then create reports automatically once the desired time interval has been reached.

However, I feel like I should be able to simply pull more than 1000 data points at a given time and that be the end of it. Is there any way to disable this 1000 data point limit or modify it somehow?


Could you please post the code you're using to pull down the rows of data from IO?

brubell
 
Posts: 1040
Joined: Fri Jul 17, 2015 10:33 pm

Re: Row Limit Python data retrieval

by Weather411 on Tue Jul 07, 2020 10:14 pm

I'm glad to find this post. I agree that the 1000 limit should be waived, especially for paid subscribers to AIO+. Maybe impose a 20,000 record limit? What I do to get around this is to run my script once per day (Windows Scheduler) and pull the last 24 hours of feed data. To answer a previous question raised here, I found this out while troubleshooting via the AIO API Python documentation:

https://io.adafruit.com/api/docs/?python#pagination

The main function of my Python script that goes out to get the data is (hopefully this will save some people time as I did the trial and error):

Code:
import json

import requests

# "start_dt", "end_dt" and "http_get_req_timeout" are defined globally,
# but could be passed as function arguments as well.

def http_get_req(feed_key):
    """Request data from Adafruit IO. "feed_key" is a key of the
    "feed_names" dictionary defined globally at the top. Returns a
    list of dictionaries (parsed JSON) for the feed."""
    headers = {"X-AIO-Key": "PUT_YOUR_AIO_KEY_HERE"}
    params = (("start_time", start_dt), ("end_time", end_dt))
    payload = json.loads(
        requests.get(
            "https://io.adafruit.com/api/v2/PUT_YOUR_AIO_USERNAME_HERE/feeds/" + feed_key + "/data",
            headers=headers,
            params=params,
            timeout=http_get_req_timeout,
        ).text
    )
    return payload


Globally define your feeds via dictionary (example below):

Code:
#### FEED KEY : FEED NAME ####
feed_names = {
    "wxst-outdr-1.temp-f": "Temp_F",
    "wxst-outdr-1.hum-per": "Hum_per",
    "wxst-outdr-1.voltage-v": "Voltage_V",
    "wxst-outdr-1.current-ma": "Current_mA",
    "wxst-outdr-1.power-mw": "Power_mW",
    "wxst-outdr-1.rssi-dbm": "RSSI_dBm",
}


Then it's a matter of iterating...

Code:
    #### ITERATE OVER THE AIO FEEDS FROM DICTIONARY DEFINED GLOBALLY ####
    for feed_key, feed_name in feed_names.items():

        #### RUN FUNCTION TO GET A LIST OF DICTIONARIES FOR FEED NAME ####
        data_dump = http_get_req(feed_key)

        #### KEEP TRACK OF NUMBER OF RECORDS, INITIALLY START AT 0 ####
        counter = 0

        #### ITERATE OVER THE FUNCTION RESULT - WHICH IS A DICTIONARY/JSON LIST PER FEED ####
        for feed_data in data_dump:

            #### ADVANCE THE COUNTER ####
            counter += 1

            #### GRAB THE DATA FOR EACH FEED ####
            aio_id = feed_data["id"]
            aio_dt = feed_data["created_at"].replace("T", " ").replace("Z", "")
            aio_value = feed_data["value"]
            aio_record = [aio_id, aio_dt, aio_value]

            #### AIO_RECORD IS THEN WRITTEN TO THE FEED'S CSV FILE ####


I then create a CSV file for each feed and append to it until the next month, at which point I create a new file. I also write a log file that includes things like the start/end times, when the script was run, how long it took to run, and a record count for each feed. If anybody is interested, I'll show this... but I'm getting off topic. I seriously think Adafruit should consider lifting the 1,000 record maximum, assuming their servers can handle it (I assume so). I don't use MQTT because this (HTTP GET Python API) is more than adequate for my needs. Python is awesome once you get the hang of it. Thanks for reading!
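For reference, the per-feed monthly CSV append described above could be sketched like this (the filename scheme, column order, and helper name are my assumptions, not the original script):

```python
import csv
from datetime import datetime
from pathlib import Path

def append_record(feed_name, aio_record, out_dir=".", when=None):
    """Append one [id, created_at, value] record to the feed's CSV for
    the given month (e.g. Temp_F_2020-07.csv). The header row is
    written only when a new month's file is first created."""
    when = when or datetime.now()
    path = Path(out_dir) / f"{feed_name}_{when.strftime('%Y-%m')}.csv"
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.writer(f)
        if is_new:
            writer.writerow(["id", "created_at", "value"])
        writer.writerow(aio_record)
```

Because the month is part of the filename, rollover to a new file happens automatically on the first write of each month, with no cleanup step needed.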

Weather411
 
Posts: 7
Joined: Fri Sep 15, 2017 7:38 pm
