Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify tabular API #43

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Simplify tabular API #43

wants to merge 1 commit into from

Conversation

pelillian
Copy link
Contributor

No description provided.

@pelillian pelillian requested a review from a team as a code owner December 11, 2019 06:41
@codecov
Copy link

codecov bot commented Dec 11, 2019

Codecov Report

Merging #43 into master will decrease coverage by 1.18%.
The diff coverage is 97.41%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #43      +/-   ##
==========================================
- Coverage   94.13%   92.94%   -1.19%     
==========================================
  Files           7        7              
  Lines         358      326      -32     
  Branches       58       53       -5     
==========================================
- Hits          337      303      -34     
  Misses         12       12              
- Partials        9       11       +2
Impacted Files Coverage Δ
src/dowel/simple_outputs.py 96.92% <100%> (+2.37%) ⬆️
src/dowel/tensor_board_output.py 94.5% <100%> (-1.06%) ⬇️
src/dowel/logger.py 94.18% <100%> (+0.51%) ⬆️
src/dowel/csv_output.py 95.23% <91.66%> (-2.14%) ⬇️
src/dowel/tabular.py 94.11% <94.11%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a8b5134...be6887c. Read the comment docs.

@pelillian pelillian force-pushed the tabular_api branch 2 times, most recently from 0c88b1f to 9046c85 Compare December 11, 2019 07:24
return (str, ) + np.ScalarType

@property
def keys_accepted(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this should be a constructor arg for all log outputs. people might want to make loggers which subscribe to different streams of keys (this is one of the nice parts of using keys).

src/dowel/csv_output.py Outdated Show resolved Hide resolved
@@ -143,19 +144,28 @@ class LogOutput(abc.ABC):

@property
def types_accepted(self):
"""Pass these types to this logger output.
"""Pass these value types to this logger output.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use Google style docstrings. There's info in CONTRIBUTING.md

@pelillian pelillian requested review from ryanjulian and a team December 11, 2019 18:52
@pelillian pelillian force-pushed the tabular_api branch 3 times, most recently from 6dd057d to 79d0f31 Compare December 11, 2019 21:24
Add test for accepted types and keys

Finish API and update examples

Add keys_accepted to constructor and update docs

Update docs

Renamed tabular_input to tabular
@pelillian pelillian requested review from ryanjulian and removed request for ryanjulian December 12, 2019 22:40
@@ -20,8 +19,7 @@
'TextOutput',
'LogOutput',
'LoggerWarning',
'TabularInput',
'Tabular',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is now a private API, so it should not be in __all__, which is only for things people should be importing from your package.

self._writer = None
self._fieldnames = None
self._warned_once = set()
self._disable_warnings = False
self.tabular = Tabular()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_private, please

else:
raise ValueError('Unacceptable type.')
"""Accept str and scalar objects."""
return (str, ) + np.ScalarType
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just make this a tuple rather than using addition?

if to_csv.keys() != self._fieldnames:
self._warn('Inconsistent Tabular keys detected. '
'CsvOutput keys: {}. '
'Tabular keys: {}. '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

your user now has no idea that a Tabular is, so this message needs to be updated.

self._writer.writerow(to_csv)

self._log_file.flush()
self.tabular.clear()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think we should clear the table between calls to dump, because it allows us to provide a value even if someone doesn't update it. basically, if KV pairs are not all updated at the same rate it's okay, and we don't need to output an error.

@@ -5,17 +5,18 @@

The logger has 4 major steps:

1. Inputs, such as a simple string or something more complicated like
TabularInput, are passed to the log() method of an instantiated Logger.
1. Inputs, such as a simple string or something more complicated like
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make sure this giant commend renders nicely in sphinx. Here's what it looks like now:
https://dowel.readthedocs.io/en/latest/_apidoc/dowel.html#module-dowel.logger

Perhaps you can actually move this content (+example) to the title page of the documentation. If you do that, it will probably render fine (markdown is supported for pages but not docstrings).

Anyway, to render the docs just do

cd docs
make html
xdg-open _build/html/index.html  # if on macOS, use open instead

"""Pass logger data to this output.

:param data: The data to be logged by the output.
:param key: The key to be logged by the output.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use google style docstrings

@@ -195,24 +205,30 @@ def log(self, data):
Any data sent to this method is sent to all outputs that accept its
type (defined in the types_accepted property).

:param data: Data to be logged. This can be any type specified in the
:param key: Key to be logged. This must be a string.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update these docstrings to use Google style

:return: A tuple containing all valid input types.
"""
@property
def types_accepted(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should probably be an abc.abstractmethod rather than returning anything. that way, if a user tries to write a new LogOutput and forgets to implement this method, they will get an error message. Right now, their LogOutput will just reject all logs silently because the base class says so.

from dowel.utils import mkdir_p


class StdOutput(LogOutput):
"""Standard console output for the logger.

:param keys_accepted: Regex for which keys this output should accept.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update to use google style

"""Accept str and TabularInput objects."""
return (str, TabularInput)
"""Accept str and scalar objects."""
return (str, ) + np.ScalarType
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for addition when you can just write out the tuple.

self._with_timestamp = with_timestamp
self._delimiter = ' | '
self.tabular = Tabular()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be _private

"""Accept str objects only."""
return (str, TabularInput)
"""Accept str and scalar objects."""
return (str, ) + np.ScalarType
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use a literal rather than addition

out = '%s | %s' % (timestamp, out)
self._log_file.write(out + '\n')
else:
raise ValueError('Unacceptable type')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you update this to 'Type {} cannot be accepted by FileOutput'.format(type(value).__name__)

from tabulate import tabulate


class Tabular:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm so what does this do that a dict doesn't?

i see the as_primitive_type business, but didn't you already restrict the types that end up in this thing based on the types_accepted on your LogOutput? If an LogOutput can't output a FooType kv-pair, it will just not declare it in types_accepted.

@@ -46,10 +45,12 @@ class TensorBoardOutput(LogOutput):

def __init__(self,
log_dir,
keys_accepted=r'^\S+$',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason you didn't use r'^$' (which matches all strings)? I think TensorBoard will slugify any whitespace (not saying it's good practice, but I don't think it barfs if you do it).

scipy.stats._multivariate.multi_rv_frozen,
Histogram) + np.ScalarType
types = list(types)
types.remove(str)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is going on here? why can't this just be a tuple? how is str even in this tuple types when you just defined it as a literal (not including str) 2 lines above? if you did find a str in there for some reason, how did it get there?

if self._tf is None:
return (TabularInput, )
return types
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this exactly the same as?:

if self._tf is not None:
    types += (self._tf.Graph, )

return types

if so, use that form.

if self._tf is not None and isinstance(value, self._tf.Graph):
self._record_graph(value)
elif isinstance(value, self.types_accepted):
self._dict[key] = value
else:
raise ValueError('Unacceptable type.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please update this with the more informative message i provided above.

self._record_graph(data)
if self._tf is not None and isinstance(value, self._tf.Graph):
self._record_graph(value)
elif isinstance(value, self.types_accepted):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you refactor this as

if not isinstance(value, self.types_accepted):
    raise ValueError(...)

if self._tf is not None and isinstance(value, self._tf.Graph):
    self._record_graph(value)
else:
    self._dict[key] = value

else:
raise ValueError('Unacceptable type.')
self.tabular.record(key, value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wait, what if the value has a key but is an unacceptable type???

you can avoid this bug by using a guard clause

def record(self, key, value, prefix=''):
    if not isinstance(value, self.types_accepted):
        raise ValueError(...)

    if not key and isinstance(value, str):    # i'm not sure this needs to be a str. we could just print whatever you send us (that's what print does, anyway
        # ...
    else:
        self._tabular.record(key, value)

"""Flush data to log file."""
if not self.tabular.empty:
self._log_file.write(str(self.tabular) + '\n')
self.tabular.clear()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again i'm not sure tabular needs to be cleared here. not clearing it avoids an entire class of warnings/bugs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants