Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge master into develop #13271

Merged
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
40 commits
Select commit Hold shift + click to select a range
030d63a
Merge pull request #13045 from explosion/master
rmitsch Oct 5, 2023
df07c47
Merge pull request #13046 from explosion/docs/llm_main
rmitsch Oct 5, 2023
c096c5c
Update for numpy 2.0 deprecations (#13103)
adrianeboyd Nov 6, 2023
ff9ddb6
Unskip python 3.12 remote tests (#13110)
adrianeboyd Nov 6, 2023
0c25725
Update Tokenizer.explain for special cases with whitespace (#13086)
adrianeboyd Nov 6, 2023
2b8da84
feat: add extra lexical attributes (#13106)
ridge-kimani Nov 8, 2023
513bbd5
Add preferred use of build for package CLI (#13109)
adrianeboyd Nov 8, 2023
b2e831d
LLM docs: OpenAI model update (#13119)
rmitsch Nov 8, 2023
bd2c17e
Warn about reloading dependencies after downloading models (#13081)
shadeMe Nov 10, 2023
9f2ce6b
Add Redfield NLP Nodes to the Spacy Universe (#13133)
ajbond Nov 17, 2023
b6e0223
Feature/nn and fo language extensions (#13116)
lise-brinck Nov 20, 2023
8f69e56
Add swag [ci skip]
ines Nov 20, 2023
bf7c2ea
Add merch link [ci skip]
ines Nov 22, 2023
da7ad97
Update `TextCatBOW` to use the fixed `SparseLinear` layer (#13149)
danieldk Nov 29, 2023
0e43fca
Add Claude-2.1 mention. (#13167)
rmitsch Dec 1, 2023
e467573
Docs: update trf_data examples and pipeline design info (#13164)
adrianeboyd Dec 4, 2023
55ed2b4
Add documentation for EL task (#12988)
rmitsch Dec 4, 2023
a25a3b9
Merge pull request #13173 from explosion/docs/llm_main
rmitsch Dec 4, 2023
9fcd2bf
Add info on endpoint arg. (#13169)
rmitsch Dec 5, 2023
f78b91c
Update links [ci skip]
ines Dec 11, 2023
8cfccdd
Update links [ci skip]
ines Dec 11, 2023
e79a9c5
Document `spacy-llm`'s `RawTask` (#13180)
rmitsch Dec 11, 2023
d56ee65
Document `spacy-llm`'s `TranslationTask` (#13183)
rmitsch Dec 11, 2023
7df328f
Update README.md [ci skip]
ines Dec 12, 2023
56fc3bc
Type documentation fixes for Doc (#13187)
svlandeg Dec 18, 2023
764be10
update README to include links to GPU processing, LLM's, and the spaC…
ojo4f3 Dec 18, 2023
7ebba86
Add TextCatReduce.v1 (#13181)
danieldk Dec 21, 2023
e2a3952
Add spacy.TextCatParametricAttention.v1 (#13201)
danieldk Jan 2, 2024
0062c22
Updated docs w.r.t. infinite doc length changes (#13214)
rmitsch Jan 5, 2024
c608bae
Fix typo in method name
mauricesvp Jan 16, 2024
91c24c0
Merge pull request #13251 from explosion/docs/llm_develop
rmitsch Jan 19, 2024
256468c
Merge branch 'docs/llm_main' into chore/sync-master-with-llm_main
rmitsch Jan 19, 2024
575c405
Fix LLM docs on task factories.
rmitsch Jan 19, 2024
3b3b5cd
Merge pull request #13253 from explosion/chore/sync-master-with-llm_main
rmitsch Jan 19, 2024
128197a
Properly clean up pipe multiprocessing workers (#13259)
danieldk Jan 23, 2024
5a2ad4a
Merge remote-tracking branch 'upstream/master' into patch-1
danieldk Jan 23, 2024
afac7fb
test_find_available_port: use port 5001 (#13255)
danieldk Jan 23, 2024
a8894a8
Merge pull request #13240 from mauricesvp/patch-1
danieldk Jan 23, 2024
a493981
fix typo (#13254)
svlandeg Jan 24, 2024
e1c1889
Merge remote-tracking branch 'upstream/master' into develop
danieldk Jan 25, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .github/FUNDING.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
custom: [https://explosion.ai/merch, https://explosion.ai/tailored-solutions]
11 changes: 9 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -39,28 +39,35 @@ open-source software, released under the
| 🚀 **[New in v3.0]** | New features, backwards incompatibilities and migration guide. |
| 🪐 **[Project Templates]** | End-to-end workflows you can clone, modify and run. |
| 🎛 **[API Reference]** | The detailed reference for spaCy's API. |
| ⏩ **[GPU Processing]** | Use spaCy with CUDA-compatible GPU processing. |
| 📦 **[Models]** | Download trained pipelines for spaCy. |
| 🦙 **[Large Language Models]** | Integrate LLMs into spaCy pipelines. |
| 🌌 **[Universe]** | Plugins, extensions, demos and books from the spaCy ecosystem. |
| ⚙️ **[spaCy VS Code Extension]** | Additional tooling and features for working with spaCy's config files. |
| 👩‍🏫 **[Online Course]** | Learn spaCy in this free and interactive online course. |
| 📰 **[Blog]** | Read about current spaCy and Prodigy development, releases, talks and more from Explosion. |
| 📺 **[Videos]** | Our YouTube channel with video tutorials, talks and more. |
| 🛠 **[Changelog]** | Changes and version history. |
| 💝 **[Contribute]** | How to contribute to the spaCy project and code base. |
| <a href="https://explosion.ai/spacy-tailored-pipelines"><img src="https://user-images.githubusercontent.com/13643239/152853098-1c761611-ccb0-4ec6-9066-b234552831fe.png" width="125" alt="spaCy Tailored Pipelines"/></a> | Get a custom spaCy pipeline, tailor-made for your NLP problem by spaCy's core developers. Streamlined, production-ready, predictable and maintainable. Start by completing our 5-minute questionnaire to tell us what you need and we'll be in touch! **[Learn more &rarr;](https://explosion.ai/spacy-tailored-pipelines)** |
| <a href="https://explosion.ai/spacy-tailored-analysis"><img src="https://user-images.githubusercontent.com/1019791/206151300-b00cd189-e503-4797-aa1e-1bb6344062c5.png" width="125" alt="spaCy Tailored Pipelines"/></a> | Bespoke advice for problem solving, strategy and analysis for applied NLP projects. Services include data strategy, code reviews, pipeline design and annotation coaching. Curious? Fill in our 5-minute questionnaire to tell us what you need and we'll be in touch! **[Learn more &rarr;](https://explosion.ai/spacy-tailored-analysis)** |
| 👕 **[Swag]** | Support us and our work with unique, custom-designed swag! |
| <a href="https://explosion.ai/tailored-solutions"><img src="https://github.com/explosion/spaCy/assets/13643239/36d2a42e-98c0-4599-90e1-788ef75181be" width="150" alt="Tailored Solutions"/></a> | Custom NLP consulting, implementation and strategic advice by spaCy’s core development team. Streamlined, production-ready, predictable and maintainable. Send us an email or take our 5-minute questionnaire, and well'be in touch! **[Learn more &rarr;](https://explosion.ai/tailored-solutions)** |

[spacy 101]: https://spacy.io/usage/spacy-101
[new in v3.0]: https://spacy.io/usage/v3
[usage guides]: https://spacy.io/usage/
[api reference]: https://spacy.io/api/
[gpu processing]: https://spacy.io/usage#gpu
[models]: https://spacy.io/models
[large language models]: https://spacy.io/usage/large-language-models
[universe]: https://spacy.io/universe
[spacy vs code extension]: https://github.com/explosion/spacy-vscode
[videos]: https://www.youtube.com/c/ExplosionAI
[online course]: https://course.spacy.io
[blog]: https://explosion.ai
[project templates]: https://github.com/explosion/projects
[changelog]: https://spacy.io/usage#changelog
[contribute]: https://github.com/explosion/spaCy/blob/master/CONTRIBUTING.md
[swag]: https://explosion.ai/merch

## 💬 Where to ask questions

Expand Down
42 changes: 42 additions & 0 deletions licenses/3rd_party_licenses.txt
Original file line number Diff line number Diff line change
Expand Up @@ -158,3 +158,45 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


SciPy
-----

* Files: scorer.py

The implementation of trapezoid() is adapted from SciPy, which is distributed
under the following license:

New BSD License

Copyright (c) 2001-2002 Enthought, Inc. 2003-2023, SciPy Developers.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions
are met:

1. Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived
from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
2 changes: 1 addition & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ requires = [
"cymem>=2.0.2,<2.1.0",
"preshed>=3.0.2,<3.1.0",
"murmurhash>=0.28.0,<1.1.0",
"thinc>=8.1.8,<8.3.0",
"thinc>=8.2.2,<8.3.0",
"numpy>=1.15.0; python_version < '3.9'",
"numpy>=1.25.0; python_version >= '3.9'",
]
Expand Down
2 changes: 1 addition & 1 deletion requirements.txt
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@ spacy-legacy>=3.0.11,<3.1.0
spacy-loggers>=1.0.0,<2.0.0
cymem>=2.0.2,<2.1.0
preshed>=3.0.2,<3.1.0
thinc>=8.1.8,<8.3.0
thinc>=8.2.2,<8.3.0
ml_datasets>=0.2.0,<0.3.0
murmurhash>=0.28.0,<1.1.0
wasabi>=0.9.1,<1.2.0
Expand Down
4 changes: 2 additions & 2 deletions setup.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -41,15 +41,15 @@ setup_requires =
cymem>=2.0.2,<2.1.0
preshed>=3.0.2,<3.1.0
murmurhash>=0.28.0,<1.1.0
thinc>=8.1.8,<8.3.0
thinc>=8.2.2,<8.3.0
install_requires =
# Our libraries
spacy-legacy>=3.0.11,<3.1.0
spacy-loggers>=1.0.0,<2.0.0
murmurhash>=0.28.0,<1.1.0
cymem>=2.0.2,<2.1.0
preshed>=3.0.2,<3.1.0
thinc>=8.1.8,<8.3.0
thinc>=8.2.2,<8.3.0
wasabi>=0.9.1,<1.2.0
srsly>=2.4.3,<3.0.0
catalogue>=2.0.6,<2.1.0
Expand Down
30 changes: 29 additions & 1 deletion spacy/cli/download.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,14 @@

from .. import about
from ..errors import OLD_MODEL_SHORTCUTS
from ..util import get_minor_version, is_package, is_prerelease_version, run_command
from ..util import (
get_minor_version,
is_in_interactive,
is_in_jupyter,
is_package,
is_prerelease_version,
run_command,
)
from ._util import SDIST_SUFFIX, WHEEL_SUFFIX, Arg, Opt, app


Expand Down Expand Up @@ -77,6 +84,27 @@ def download(
"Download and installation successful",
f"You can now load the package via spacy.load('{model_name}')",
)
if is_in_jupyter():
reload_deps_msg = (
"If you are in a Jupyter or Colab notebook, you may need to "
"restart Python in order to load all the package's dependencies. "
"You can do this by selecting the 'Restart kernel' or 'Restart "
"runtime' option."
)
msg.warn(
"Restart to reload dependencies",
reload_deps_msg,
)
elif is_in_interactive():
reload_deps_msg = (
"If you are in an interactive Python session, you may need to "
"exit and restart Python to load all the package's dependencies. "
"You can exit with Ctrl-D (or Ctrl-Z and Enter on Windows)."
)
msg.warn(
"Restart to reload dependencies",
reload_deps_msg,
)


def get_model_filename(model_name: str, version: str, sdist: bool = False) -> str:
Expand Down
59 changes: 53 additions & 6 deletions spacy/cli/package.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
import os
import re
import shutil
import subprocess
import sys
from collections import defaultdict
from pathlib import Path
Expand All @@ -11,6 +13,7 @@
from wasabi import MarkdownRenderer, Printer, get_raw_input

from .. import about, util
from ..compat import importlib_metadata
from ..schemas import ModelMetaSchema, validate
from ._util import SDIST_SUFFIX, WHEEL_SUFFIX, Arg, Opt, app, string_to_list

Expand All @@ -35,7 +38,7 @@ def package_cli(
specified output directory, and the data will be copied over. If
--create-meta is set and a meta.json already exists in the output directory,
the existing values will be used as the defaults in the command-line prompt.
After packaging, "python setup.py sdist" is run in the package directory,
After packaging, "python -m build --sdist" is run in the package directory,
which will create a .tar.gz archive that can be installed via "pip install".

If additional code files are provided (e.g. Python files containing custom
Expand Down Expand Up @@ -78,9 +81,17 @@ def package(
input_path = util.ensure_path(input_dir)
output_path = util.ensure_path(output_dir)
meta_path = util.ensure_path(meta_path)
if create_wheel and not has_wheel():
err = "Generating a binary .whl file requires wheel to be installed"
msg.fail(err, "pip install wheel", exits=1)
if create_wheel and not has_wheel() and not has_build():
err = (
"Generating wheels requires 'build' or 'wheel' (deprecated) to be installed"
)
msg.fail(err, "pip install build", exits=1)
if not has_build():
msg.warn(
"Generating packages without the 'build' package is deprecated and "
"will not be supported in the future. To install 'build': pip "
"install build"
)
if not input_path or not input_path.exists():
msg.fail("Can't locate pipeline data", input_path, exits=1)
if not output_path or not output_path.exists():
Expand Down Expand Up @@ -184,12 +195,37 @@ def package(
msg.good(f"Successfully created package directory '{model_name_v}'", main_path)
if create_sdist:
with util.working_dir(main_path):
util.run_command([sys.executable, "setup.py", "sdist"], capture=False)
# run directly, since util.run_command is not designed to continue
# after a command fails
ret = subprocess.run(
[sys.executable, "-m", "build", ".", "--sdist"],
env=os.environ.copy(),
)
if ret.returncode != 0:
msg.warn(
"Creating sdist with 'python -m build' failed. Falling "
"back to deprecated use of 'python setup.py sdist'"
)
util.run_command([sys.executable, "setup.py", "sdist"], capture=False)
zip_file = main_path / "dist" / f"{model_name_v}{SDIST_SUFFIX}"
msg.good(f"Successfully created zipped Python package", zip_file)
if create_wheel:
with util.working_dir(main_path):
util.run_command([sys.executable, "setup.py", "bdist_wheel"], capture=False)
# run directly, since util.run_command is not designed to continue
# after a command fails
ret = subprocess.run(
[sys.executable, "-m", "build", ".", "--wheel"],
env=os.environ.copy(),
)
if ret.returncode != 0:
msg.warn(
"Creating wheel with 'python -m build' failed. Falling "
"back to deprecated use of 'wheel' with "
"'python setup.py bdist_wheel'"
)
util.run_command(
[sys.executable, "setup.py", "bdist_wheel"], capture=False
)
wheel_name_squashed = re.sub("_+", "_", model_name_v)
wheel = main_path / "dist" / f"{wheel_name_squashed}{WHEEL_SUFFIX}"
msg.good(f"Successfully created binary wheel", wheel)
Expand All @@ -209,6 +245,17 @@ def has_wheel() -> bool:
return False


def has_build() -> bool:
# it's very likely that there is a local directory named build/ (especially
# in an editable install), so an import check is not sufficient; instead
# check that there is a package version
try:
importlib_metadata.version("build")
return True
except importlib_metadata.PackageNotFoundError: # type: ignore[attr-defined]
return False


def get_third_party_dependencies(
config: Config, exclude: List[str] = util.SimpleFrozenList()
) -> List[str]:
Expand Down
Loading
Loading