Developing with the Python SDK
Gradle can build and test the Python SDK. Because the Jenkins jobs use it, the Gradle build must be kept working.
You can also use the Python toolchain directly instead of having Gradle orchestrate it. This may be faster, but the choice is yours. If you do use the Python tools directly, we recommend setting up a virtual environment before testing your code.
If you update any of the cythonized files in the Python SDK, you must install the
cython package before running the following commands to properly test your code.
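Before rebuilding cythonized files, it can help to confirm that Cython is importable from the interpreter you plan to test with. The following is a minimal sketch, not part of the SDK; the helper name `module_available` is hypothetical, and the only assumption is that the cython package is imported as `Cython`.

```python
import importlib


def module_available(name):
    """Return True if `name` can be imported by the current interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Hypothetical usage: check for Cython before testing cythonized files.
cython_ready = module_available("Cython")
```

If `cython_ready` is false, install the cython package into the same environment before running the commands that follow.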
Run the following commands in the
sdks/python directory. They install the Python SDK from source, in editable mode, along with the test and gcp dependencies.
> c:\Python27\python.exe -m virtualenv env
> env\Scripts\activate
(env) > pip install -e .[gcp,test]
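To confirm that the install went into a virtual environment rather than the system interpreter, a small check like the one below can be run from the activated shell. This is a sketch under stated assumptions: the function name `in_virtualenv` is hypothetical, and it relies on older virtualenv releases setting `sys.real_prefix` while newer virtualenv releases and the standard `venv` module make `sys.base_prefix` differ from `sys.prefix`.

```python
import sys


def in_virtualenv():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv and newer virtualenv releases keep the original prefix in
    # sys.base_prefix; fall back to sys.prefix where it does not exist.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix
```

Calling `in_virtualenv()` from the interpreter started inside the activated environment should return True; a False result suggests the environment was not activated.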
The following command runs all Python tests. The nose dependency is installed by the [test] extra in the pip install command.
(env) $ python setup.py nosetests
You can use the following command to run a single test method.
(env) $ python setup.py nosetests --tests <module>:<test class>.<test method>
(env) $ python setup.py nosetests --tests apache_beam.io.textio_test:TextSourceTest.test_progress
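The `<module>:<test class>.<test method>` selector is easy to mistype, since it mixes a colon and a dot. As an illustration only, the sketch below assembles the selector and the full argument list from its parts; `nose_target` and `nosetests_command` are hypothetical helper names, not part of Beam or nose.

```python
def nose_target(module, test_class, test_method):
    """Build a selector of the form <module>:<test class>.<test method>."""
    return "{}:{}.{}".format(module, test_class, test_method)


def nosetests_command(module, test_class, test_method):
    """Return the argument list for running one test method via setup.py."""
    target = nose_target(module, test_class, test_method)
    return ["python", "setup.py", "nosetests", "--tests", target]
```

For the example above, `nose_target("apache_beam.io.textio_test", "TextSourceTest", "test_progress")` produces the same selector string that is passed to `--tests`.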
You can deactivate the virtualenv when done.
(env) $ deactivate
To check just for Python lint errors, run the following command.
$ ../../gradlew lint
You can also run the individual lint tasks with tox:
$ tox -e py27-lint # For Python 2.7
$ tox -e py3-lint # For Python 3
$ tox -e py27-lint3 # For Python 2-3 compatibility
This step is only required for testing SDK code changes remotely, that is, with a runner other than the DirectRunner. To do this, you must build the Beam tarball. From the root of the git repository, run:
$ cd sdks/python/
$ python setup.py sdist
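After the build, the tarball path is needed for the remote run. The sketch below shows one way to locate it, assuming that sdist wrote a .tar.gz archive into a dist directory under sdks/python (the usual default on Unix-like systems); the helper name `newest_sdist` is hypothetical.

```python
import glob
import os


def newest_sdist(dist_dir="dist"):
    """Return the most recently modified .tar.gz in dist_dir, or None if there is none."""
    candidates = glob.glob(os.path.join(dist_dir, "*.tar.gz"))
    if not candidates:
        return None
    # Pick by modification time so a stale archive from an earlier build is skipped.
    return max(candidates, key=os.path.getmtime)
```

Choosing by modification time rather than by name avoids picking up an older archive when the dist directory holds several builds.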
Pass the --sdk_location flag to use the newly built version. For example:
$ python setup.py sdist > /dev/null && \
python -m apache_beam.examples.wordcount ... \