TST: Enable Google BigQuery (pandas.io.gbq) integration testing #11089 #14111

Closed
wants to merge 2 commits into from

Conversation

parthea
Contributor

@parthea parthea commented Aug 28, 2016

@jreback
Contributor

jreback commented Aug 28, 2016

whoo hoo!

only thing, the attached encrypted json file is decrypted by the obfuscated travis variables, right? IOW it is ONLY usable on travis

@jreback jreback added IO Google Testing pandas testing functions or related to the test suite labels Aug 28, 2016
@jreback jreback added this to the 0.19.0 milestone Aug 28, 2016
@parthea parthea changed the title Enable Google BigQuery (pandas.io.gbq) integration testing #11089 TST: Enable Google BigQuery (pandas.io.gbq) integration testing #11089 Aug 28, 2016
@parthea
Contributor Author

parthea commented Aug 28, 2016

only thing, the attached encrypted json file is decrypted by the obfuscated travis variables, right? IOW it is ONLY usable on travis

Correct! The json credentials file should only be decrypted on Travis. The private key is tied to the repository and only available in the Travis environment.

@parthea
Contributor Author

parthea commented Aug 28, 2016

@jreback I'm not sure if I have access to encrypt the file for the pydata/pandas repo (I used my forked repo). Are you able to run travis encrypt-file travis_gbq.json in the root directory of pydata/pandas?

You'll need to save the credentials file (previously sent via email) as travis_gbq.json in the root directory of pydata/pandas, but make sure not to commit travis_gbq.json. Only commit the travis_gbq.json.enc file.

Also, please modify line 232 in .travis.yml to use the correct names of the Travis environment variables that contain the private key. These names are printed after running travis encrypt-file ...

Here is the error I get when I try to create travis_gbq.json.enc:

tony@tonypc:~/pandas-getencr$ travis encrypt-file travis_gbq.json
encrypting travis_gbq.json for pydata/pandas
storing result as travis_gbq.json.enc
storing secure env variables for decryption
resource not found ({"error":"Couldn't find repository"})

I didn't receive this error on my forked repo. It may work if you run it.

@jreback
Contributor

jreback commented Aug 28, 2016

did a PR to your branch. I moved the files to ci/ but the keys should work. lmk if you need anything.

@parthea parthea force-pushed the enable-gbq-unit-tests branch 6 times, most recently from 6df7f6e to 89e7091 Compare August 28, 2016 15:17
@parthea
Contributor Author

parthea commented Aug 28, 2016

I used the changes from parthea#1 in this PR and I receive the following error:

> openssl aes-256-cbc -K $encrypted_1d9d7b1f171b_key -iv $encrypted_1d9d7b1f171b_iv -in ci/travis_gbq.json.enc -out ci/travis_gbq.json -d

iv undefined

I believe there is a security restriction on pull requests from forked repos as mentioned here:
https://docs.travis-ci.com/user/pull-requests#Pull-Requests-and-Security-Restrictions

The most important restriction for pull requests is about secure environment variables and encrypted data.

A pull request sent from a fork of the upstream repository could be manipulated to expose any environment variables. The upstream repository’s maintainer would have no protection against this attack, as pull requests can be sent by anyone with a fork.

Travis CI makes encrypted variables and data available only to pull requests coming from the same repository. These are considered trustworthy, as only members with write access to the repository can send them.

Pull requests sent from forked repositories do not have access to encrypted variables or data.

Can I show a successful pass on my forked repo with my own encrypted file?
I updated the credentials file so that the tests pass in my forked repo. Here is a link to the test in progress: https://travis-ci.org/parthea/pandas/builds/155756133
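
For illustration only, here is a minimal Python sketch of the restriction quoted above. It assumes the build runs on Travis CI, which sets `TRAVIS_SECURE_ENV_VARS` to `"true"` only when secure variables are exposed to the build. The helper name `secure_vars_available` is hypothetical and is not part of this PR.

```python
import os


def secure_vars_available(env=None):
    """Return True when Travis reports that secure variables are exposed.

    `env` is any mapping of environment variables; it defaults to
    os.environ. Outside Travis the variable is absent, so this returns False.
    """
    env = os.environ if env is None else env
    return env.get("TRAVIS_SECURE_ENV_VARS", "false") == "true"


if __name__ == "__main__":
    # On a pull request from a fork this is expected to print False,
    # which matches the "iv undefined" failure seen above.
    print(secure_vars_available())
```

A build script could consult a check like this before attempting to decrypt anything, instead of failing inside `openssl`.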

@parthea
Contributor Author

parthea commented Aug 28, 2016

Are these errors expected?

======================================================================
ERROR: pandas.io.tests.test_pickle.TestPickle.test_pickles('0.17.0',)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/pandas/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/build/parthea/pandas/pandas/io/tests/test_pickle.py", line 174, in read_pickles
    data = self.compare(vf, version)
  File "/home/travis/build/parthea/pandas/pandas/io/tests/test_pickle.py", line 75, in compare
    expected = self.data[typ][dt]
KeyError: 'period'
======================================================================
ERROR: pandas.io.tests.test_pickle.TestPickle.test_pickles('0.17.1',)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/pandas/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/build/parthea/pandas/pandas/io/tests/test_pickle.py", line 174, in read_pickles
    data = self.compare(vf, version)
  File "/home/travis/build/parthea/pandas/pandas/io/tests/test_pickle.py", line 75, in compare
    expected = self.data[typ][dt]
KeyError: 'period'
======================================================================
ERROR: pandas.io.tests.test_pickle.TestPickle.test_pickles('0.18.0',)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/pandas/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/build/parthea/pandas/pandas/io/tests/test_pickle.py", line 174, in read_pickles
    data = self.compare(vf, version)
  File "/home/travis/build/parthea/pandas/pandas/io/tests/test_pickle.py", line 75, in compare
    expected = self.data[typ][dt]
KeyError: 'period'
======================================================================
ERROR: pandas.io.tests.test_pickle.TestPickle.test_pickles('0.18.1',)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/miniconda/envs/pandas/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/travis/build/parthea/pandas/pandas/io/tests/test_pickle.py", line 174, in read_pickles
    data = self.compare(vf, version)
  File "/home/travis/build/parthea/pandas/pandas/io/tests/test_pickle.py", line 75, in compare
    expected = self.data[typ][dt]
KeyError: 'period'
----------------------------------------------------------------------

@jreback
Contributor

jreback commented Aug 28, 2016

u need to push tags to your master branch

@parthea
Contributor Author

parthea commented Aug 29, 2016

u need to push tags to your master branch

@jreback Thanks! That was it.

All tests pass in this PR from my forked repo.

Only the first commit (89e709154b0417df163c93d33c3f3aecea624551) should be merged. The second commit (55821a65da69193e8e471944e9b0074474b290d8) contains a credentials file which was encrypted from my forked repo and won't be required.

If you want to see that Travis is green prior to merging, one solution is to create a branch from pydata/pandas and merge commit 89e709154b0417df163c93d33c3f3aecea624551 into that branch. I expect the correct (pydata/pandas) credentials will work since the branch is not from a different (forked) repo.

@parthea
Contributor Author

parthea commented Aug 29, 2016

@jreback Ready for review. Travis is green at parthea#2

@jreback
Contributor

jreback commented Aug 29, 2016

ok the first commit contains the correct credentials?

@parthea
Contributor Author

parthea commented Aug 29, 2016

Yes. The first commit contains the correct credentials from parthea#1

Only merge the first commit. If it's not too much trouble, you could create a PR from pydata/pandas and merge only the first commit just to make sure. The credentials should work since the PR branch is not from a forked repo.

@jreback
Contributor

jreback commented Aug 29, 2016

np

I am going to (later) push this up as a PR on master itself, so it should build

@jreback
Contributor

jreback commented Aug 29, 2016

the only thing is we may need to put in a check to skip trying to get these credentials when not on the master branch
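
One way such a check could look, as a rough Python sketch rather than the change actually made in this PR: build the `openssl` command quoted earlier only when both decryption variables are defined, and return `None` otherwise so the caller can skip the step. The variable names and file paths are taken from the command shown above; the function name `decrypt_command` is hypothetical.

```python
import os

# Variable names as they appear in the openssl command quoted earlier.
KEY_VAR = "encrypted_1d9d7b1f171b_key"
IV_VAR = "encrypted_1d9d7b1f171b_iv"


def decrypt_command(env=None,
                    src="ci/travis_gbq.json.enc",
                    dst="ci/travis_gbq.json"):
    """Return the openssl argument list, or None when the key or iv is missing."""
    env = os.environ if env is None else env
    key = env.get(KEY_VAR)
    iv = env.get(IV_VAR)
    if not key or not iv:
        # Secure variables are not available in this build: skip decryption.
        return None
    return ["openssl", "aes-256-cbc", "-K", key, "-iv", iv,
            "-in", src, "-out", dst, "-d"]


if __name__ == "__main__":
    cmd = decrypt_command()
    if cmd is None:
        print("No decryption key available; skipping credentials step.")
    else:
        # The secret values are deliberately not printed.
        print("Would run openssl to write ci/travis_gbq.json")
```

Returning the argument list instead of running it keeps the decision separate from the side effect; a caller could pass the list to `subprocess.check_call` when it is not `None`.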

@parthea parthea force-pushed the enable-gbq-unit-tests branch from 55821a6 to c0543c7 Compare August 30, 2016 02:20
@parthea parthea force-pushed the enable-gbq-unit-tests branch from c0543c7 to 1a13487 Compare August 30, 2016 03:06
@codecov-io

codecov-io commented Aug 30, 2016

Current coverage is 85.27% (diff: 100%)

Merging #14111 into master will not change coverage

@@             master     #14111   diff @@
==========================================
  Files           139        139          
  Lines         50511      50511          
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
  Hits          43071      43071          
  Misses         7440       7440          
  Partials          0          0          

Powered by Codecov. Last update 10bf721...1a13487

@parthea
Contributor Author

parthea commented Aug 30, 2016

the only thing is we may need to put in a check to skip trying to get these credentials when not on the master branch

Done

Tests are skipped when the encryption key is not available on Travis. See https://travis-ci.org/pydata/pandas/builds/156106141

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
#25 pandas.io.tests.test_gbq.GBQUnitTests.test_read_gbq_with_corrupted_private_key_json_should_fail: Cannot run integration tests without a private key json file path
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
#26 pandas.io.tests.test_gbq.TestGBQConnectorIntegration.test_get_application_default_credentials_does_not_throw_error: Cannot run integration tests without a project id
#27 pandas.io.tests.test_gbq.TestGBQConnectorIntegration.test_get_application_default_credentials_returns_credentials: Cannot run integration tests without a project id
#28 pandas.io.tests.test_gbq.TestGBQConnectorIntegration.test_should_be_able_to_get_a_bigquery_service: Cannot run integration tests without a project id
#29 pandas.io.tests.test_gbq.TestGBQConnectorIntegration.test_should_be_able_to_get_results_from_query: Cannot run integration tests without a project id
#30 pandas.io.tests.test_gbq.TestGBQConnectorIntegration.test_should_be_able_to_get_schema_from_query: Cannot run integration tests without a project id
#31 pandas.io.tests.test_gbq.TestGBQConnectorIntegration.test_should_be_able_to_get_valid_credentials: Cannot run integration tests without a project id
#32 pandas.io.tests.test_gbq.TestGBQConnectorIntegration.test_should_be_able_to_make_a_connector: Cannot run integration tests without a project id
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
#33 pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyContentsIntegration.test_should_be_able_to_get_a_bigquery_service: Cannot run integration tests without a project id
#34 pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyContentsIntegration.test_should_be_able_to_get_results_from_query: Cannot run integration tests without a project id
#35 pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyContentsIntegration.test_should_be_able_to_get_schema_from_query: Cannot run integration tests without a project id
#36 pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyContentsIntegration.test_should_be_able_to_get_valid_credentials: Cannot run integration tests without a project id
#37 pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyContentsIntegration.test_should_be_able_to_make_a_connector: Cannot run integration tests without a project id
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
#38 pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyPathIntegration.test_should_be_able_to_get_a_bigquery_service: Cannot run integration tests without a project id
#39 pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyPathIntegration.test_should_be_able_to_get_results_from_query: Cannot run integration tests without a project id
#40 pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyPathIntegration.test_should_be_able_to_get_schema_from_query: Cannot run integration tests without a project id
#41 pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyPathIntegration.test_should_be_able_to_get_valid_credentials: Cannot run integration tests without a project id
#42 pandas.io.tests.test_gbq.TestGBQConnectorServiceAccountKeyPathIntegration.test_should_be_able_to_make_a_connector: Cannot run integration tests without a project id
---------------------------------------------------------------------------------------------------------------------
#43 <nose.suite.ContextSuite context=TestReadGBQIntegration>:setup: Cannot run integration tests without a project id
#44 <nose.suite.ContextSuite context=TestToGBQIntegration>:setup: Cannot run integration tests without a project id
#45 <nose.suite.ContextSuite context=TestToGBQIntegrationServiceAccountKeyContents>:setup: Cannot run integration tests without a project id
#46 <nose.suite.ContextSuite context=TestToGBQIntegrationServiceAccountKeyPath>:setup: Cannot run integration tests without a project id
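
The skip messages in the output above suggest a guard along the following lines. This is a minimal sketch, not the code in `pandas/io/tests/test_gbq.py`: the helper name `skip_unless_configured` and its parameters are hypothetical, and it uses the standard library's `unittest.SkipTest` rather than whatever the test module actually raises.

```python
import unittest


def skip_unless_configured(project_id, private_key_path):
    """Raise SkipTest when the settings needed for integration tests are missing.

    Both arguments are plain strings (or None); how they are obtained,
    for example from environment variables, is left to the caller.
    """
    if not project_id:
        raise unittest.SkipTest(
            "Cannot run integration tests without a project id")
    if not private_key_path:
        raise unittest.SkipTest(
            "Cannot run integration tests without a private key json file path")


class ExampleIntegrationTest(unittest.TestCase):
    """Hypothetical test case showing where the guard would be called."""

    def setUp(self):
        # With no settings supplied, every test in this class is skipped.
        skip_unless_configured(project_id=None, private_key_path=None)

    def test_placeholder(self):
        self.assertTrue(True)
```

Raising the exception in `setUp` reports the test as skipped rather than failed, so builds without credentials stay green.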

@jreback jreback closed this in f92cd7e Aug 31, 2016
@jreback
Contributor

jreback commented Aug 31, 2016

@parthea ok I pushed up that 1st commit.

pls check out the build and see if it works. ideally we could have some instructions for users to test on their forks as well ....any ideas?

@jreback
Contributor

jreback commented Aug 31, 2016

@parthea I am not sure this worked: https://travis-ci.org/pydata/pandas/jobs/156506548

@parthea
Contributor Author

parthea commented Aug 31, 2016

It appears to be working. The following 2 tests are supposed to be skipped:

pandas.io.tests.test_gbq.TestGBQConnectorIntegration.test_get_application_default_credentials_returns_credentials: Cannot get default_credentials from the environment!
---------------------------------------------------------------------------------------------------------------------------------
#43 pandas.io.tests.test_gbq.TestReadGBQIntegration.test_should_read_as_user_account: Cannot run local auth in travis environment
--------------------------------------------------------------------------------------------------

The first test only works with a Google Environment, and the 2nd test can't be automated (we use a service account).

I'll look into adding a PR to update the contributing documentation.

@jreback
Contributor

jreback commented Aug 31, 2016

ahh ok great! So at the very least, a change that doesn't pass the integration tests will fail master. great!

@parthea
Contributor Author

parthea commented Sep 3, 2016

I've added instructions for users to run Google BigQuery integration tests on their forks in #14144

@parthea parthea deleted the enable-gbq-unit-tests branch September 4, 2016 16:55
Labels
Testing pandas testing functions or related to the test suite
Development

Successfully merging this pull request may close these issues.

BigQuery integration tests are skipped on travis
3 participants