-
Notifications
You must be signed in to change notification settings - Fork 240
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add get forex data #258
base: master
Are you sure you want to change the base?
Add get forex data #258
Conversation
…add_get_forex_data
…add_get_forex_data
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this PR! 😁
Starting with some initial comments around whether there's a way to do the data pull with a date filter right away so that the processing time can be shortened. At the moment, it seems to take very long. Also, added a comment about including get_forex_data
in the init.py of the data
module.
python/fastquant/data/forex/tests.py
Outdated
@@ -0,0 +1,127 @@ | |||
import unittest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move this file to fastquant/python/tests/
? This is where we store the tests and run them via pytest
. I'm not sure if this will automatically run as a class via pytest
, but if now, we can address that in a later PR 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, it's a unittest file, I‘ve never used pytest
. I can rewrite it use pytest later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, unittest
works fine but we have to convert this to use pytest
for consistency. And we make sure that this test is also run in Travis-CI. @Boreas514 , pytest
structure is simpler in that you don't have to define classes but only functions and assert statements (see tests).
|
||
logger.info(f'start downloading forex {symbol} data zip file...') | ||
file_size = 0 | ||
res = requests.get(f'http://www.forextester.com/templates/data/files/{symbol}.zip', stream=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Boreas514 So it seems that we download the full historical data from each data pull? Is there a way to query only the data from a specific time period? This make the get data process very slow for users 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the first time to use, it will download data from online and make local cache, after that set get_forex_data's parameter read_from_local=False so that it will read data from local.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with enzoampil that we should avoid the caching all forex data. We should cache only what is requested. Is there an easy implementation you can try?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makeing one symbol and one period cache in single process needs 15 minutes and 2GB RAM, is the cost acceptable?
A minimum of 2GB of memory is required to load 1 min chart of forex data from 2000 to 2018.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why such large memory is needed?
python/fastquant/data/forex/forex.py
Outdated
from fastquant.data.forex.forextester import get_forextester_data, get_local_data | ||
|
||
|
||
def get_forex_data(symbol, start_date=None, end_date=None, source="forextester", time_frame='D1', read_from_local=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's also import get_forex_data
in the __init__.py
of the data
module (here).
README.md
Outdated
@@ -110,6 +112,39 @@ get_crypto_data("BTCUSDT", "2019-01-01", "2019-03-01") | |||
|
|||
*R does NOT have support for backtesting yet* | |||
|
|||
## Get forex data | |||
|
|||
All symbols from [forextester.com](https://forextester.com/) are accessible via `get_forex_data`.This mehtod fetch one minute data from source and shape it to other time frame, M1, M15, H1, D1, W1 are supported time frame now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All symbols from [forextester.com](https://forextester.com/) are accessible via `get_forex_data`.This mehtod fetch one minute data from source and shape it to other time frame, M1, M15, H1, D1, W1 are supported time frame now. | |
All symbols from [forextester.com](https://forextester.com/) are accessible via `get_forex_data`.This method fetch one minute data from source and shape it to other time frame, M1, M15, H1, D1, W1 are supported time frame now. |
README.md
Outdated
2020-08-30 22:00:00 1.1906 1.1918 1.1901 1.1916 | ||
[6127 rows x 5 columns] | ||
``` | ||
this process download online data, transfer it, save cache files in disk, first call will need 25 minutes or more and 7GB free RAM to generate cache files. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many of our users may not have 7GB free RAM so we cannot implement this. Let's try to cache only data the user actually querries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makeing one symbol and one period cache in single process needs 15 minutes and 2GB RAM, is the cost acceptable?
A minimum of 2GB of memory is required to load 1 min chart of forex data from 2000 to 2018.
README.md
Outdated
``` | ||
this process download online data, transfer it, save cache files in disk, first call will need 25 minutes or more and 7GB free RAM to generate cache files. | ||
|
||
After then, set `read_from_local` to True and you get a second response if don't need update local data to newest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can read cache automatically when available else download so probably no need to add the read_from_local
flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only partial caching needs to change and we can merge.
I modified the function to reduce the hardware requirements to 2.3GB and a 20-minute conversion process, have a try. |
|
||
After then, set `read_from_local` to True and you get a second response if don't need update local data to newest. | ||
After then, set `read_from_local=Ture` and you will get a second response if don't need update local data to newest. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
read_from_local=True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No errors when I ran the branch in colab here: https://colab.research.google.com/drive/128gBYBjc4DcoS_QN3Kf4wdV1524vAjOA?hl=en#scrollTo=7sulhEFAWPl0
I like the logging functionality with ETA info.
But it is still very time consuming so we may need to implement faster solutions.
Thanks again for this @Boreas514 ! I agree with @jpdeleon , it's still kind of long. Is there a way that we can have a get function that follows the pattern of Keen to get your thoughts on this 😄 |
add get_forex_data, this method added some parameters not in stock model, details are in the comments of the method and readme