Restructured the operators source files and added support for multiple new operators. #70
Conversation
@rxwei What should I do about the

In that case, why does the

For historical reasons. GPE doesn't support variable-sized inputs/outputs, and

In that case, I made a PR to add support for them in the

I also addressed your comments, but will add the missing VJPs tomorrow because I have to finish up some other stuff tonight.
Please move the "explicit paddings" fix to a separate PR so that it can get merged before all the dependencies of this PR can. Thanks!
This compiles and passes all tests on my machine when I follow the following process:

That process works fine. The problem is when I try to compile this on its own using the built toolchain from above. That toolchain already contains all symbols from swift-apis in the stdlib, as it builds swift-apis together with the stdlib. This means that if I try to compile this with that toolchain, I get many errors about ambiguous functions.
Yeah, I understand the apple/swift path compiles and passes tests. To ensure swift-apis can be developed individually, we'll also need to make sure it works with Swift Package Manager. Earlier, there were already quite a few operators and layers that were both defined here and distributed as part of the toolchain. There were no ambiguity errors because overriding a symbol in a different module is okay. There may be some subtleties in the newly migrated APIs. It would be useful to see what exactly those ambiguities are.
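To make the shadowing point above concrete, here is a minimal sketch, assuming a toolchain whose TensorFlow module also exports a `relu(_:)` free function (the redefinition below is hypothetical):

```swift
import TensorFlow  // assume this module also defines relu(_:)

// Declarations in the current module shadow identically named declarations
// from imported modules during unqualified lookup, so this redefinition
// compiles without an ambiguity error.
public func relu<T: TensorFlowFloatingPoint>(_ x: Tensor<T>) -> Tensor<T> {
    return max(x, Tensor(T(0)))
}

let y = relu(Tensor<Float>([-1, 0, 2]))  // resolves to the local relu(_:)
```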
Let me know if you have any questions, via email or a new PR. I'm more than happy to help with those issues!
I see, yes that makes sense. So, the output when compiling swift-apis looks as follows:

That's all, actually, so it should be easy to resolve.
/// - Returns: Array containing the tensor's parts.
@inlinable
@differentiable(vjp: _vjpSplit(numSplits:alongAxis:) where Scalar: TensorFlowFloatingPoint)
func split(numSplits: Int, alongAxis axis: Int = 0) -> [Tensor] {
I missed this newly added operator. Swift APIs usually don't use the "num" prefix to denote count, but use a `Count` suffix instead. In this case, a single `count` may also be fine. You could call this `split(count:alongAxis:)`, or follow the standard library precedent: `split(maxSplits:omittingEmptySubsequences:whereSeparator:)`.
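For concreteness, a minimal sketch of the suggested spelling, written as a wrapper that forwards to the `split(numSplits:alongAxis:)` method from the diff above (the rename is a suggestion, not the merged API):

```swift
public extension Tensor {
    /// Splits the tensor into `count` equal-sized parts along the given axis.
    /// Forwards to split(numSplits:alongAxis:); only the public spelling changes.
    @inlinable
    func split(count: Int, alongAxis axis: Int = 0) -> [Tensor] {
        return split(numSplits: count, alongAxis: axis)
    }
}
```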
/// - Returns: Array containing the unstacked tensors.
@inlinable
@differentiable(vjp: _vjpUnstack(alongAxis:) where Scalar: TensorFlowFloatingPoint)
func unstack(alongAxis axis: Int = 0) -> [Tensor] {
Functions that do not have side effects should be named like nouns (see Swift API Design Guidelines - Naming - Strive for Fluent Usage). Since this method doesn't have side effects, a gerund form such as `unstacking(alongAxis:)` would be a better name.
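A sketch of that gerund spelling, forwarding to the `unstack(alongAxis:)` method shown in the diff (hypothetical; the final name is up to the PR author):

```swift
public extension Tensor {
    /// Returns the tensors obtained by unstacking this tensor along the given axis.
    /// Forwards to unstack(alongAxis:); only the name changes.
    @inlinable
    func unstacking(alongAxis axis: Int = 0) -> [Tensor] {
        return unstack(alongAxis: axis)
    }
}
```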
I think that
Feel free to comment out those
I'm not sure either. Why is the one on 633
That fixes it!
Done.
This was just something I was trying. Could this error be because the
I don't have a good solution to that. How about defining indexing logic in stdlib for now?
That's OK for now, but I also believe this is a more important issue that will likely pop up again with other extensions.
Actually, I believe this might happen a lot in the future, given that a lot of the added functionality is in the form of extensions to
Yeah, I think we should definitely get it resolved sooner rather than later. The Foundation model would be good to follow. Other people on the team have talked about moving TensorFlow out of the compiler repo as well (@pschuh @jekbradbury).
I would be happy to help with that, because even the changes introduced in this PR seem hard to handle properly (in that swift-apis cannot easily be compiled independently using S4TF). And yes, I would lean in favor of moving TensorFlow out of the compiler repo if that is an option. Could you describe the Foundation model at a high level?
Most importantly, I'm curious about how they deal with the ambiguous symbols issue that we are running into.
It's really awesome to hear you are interested in helping out! Foundation sources are located in apple/swift-corelibs-foundation, and they are pulled into the build directory as part of the build.

Foundation is a little different in that it's not a SwiftPM package, but I suspect that's not the difficult part.
Are you referring to the subscript errors? I'd suggest pulling them back to stdlib for now.
I've started using S4TF for some new research projects and would generally love to make the process of improving and extending it smoother. I love the work you guys have put in and all that lies ahead! :) Regarding Foundation, that sounds similar to what's happening now with swift-apis. I believe the main challenge (irrespective of using SwiftPM) is in compiling swift-apis as a standalone library using the S4TF compiler. I believe this would be much easier if we move all of the
That's fantastic. We really appreciate your effort in driving important changes, as well as pushing the limits of the incomplete Swift for TensorFlow.
Right. This is why we probably shouldn't attempt this right now. In the TensorFlow module there is legacy code w.r.t. GPE, and also new experimental code built by @bgogul and @pschuh et al. One option is to keep

I feel that tensor APIs being in the compiler repo really starts to limit our development, so it's probably a good time to get the discussion started. How about setting up a meeting with the team to discuss this? Are you able to video chat some time this or next week?
Hello @eaplatanios. Quick question on this PR. The description claims the addition of initializers:

Same goes for the `Optimizable` protocol.
Not sure about the new initializers. The `Optimizable` protocol has been reverted, addressing this review feedback.
@rxwei thanks... missed those comments.
Yes, definitely. That would be great! How about sometime on Wednesday? Also, I like the
Sorry, yes, we decided to remove the new ops from this PR to make it about moving stuff over from stdlib first. I have all the new ops implemented, but will add them in a separate PR after we sort out #109.
Got it! Thanks!
Reached out over email regarding the meeting!
Restructured the operators source files and added support for multiple new operators. (tensorflow#70)

* Re-organized the operators source files.
* Added support for 'stacked', 'concatenated', 'gathered', 'batchGathered', and 'masked'.
* Reverted back to 4-space tabs.
* Made some other minor changes.
* Added support for 'selecting'.
* Added support for 'nonZeroIndices'.
* Minor edits.
* Addressed Richard's feedback.
* Addressed Richard's comments.
* Addressed Richard's comments.
* Updated the convolution ops to support explicit paddings.
* Small edits.
* Updated the convolution ops to support explicit paddings.
* Small fix.
* Small fix.
* Added a new tensor initializer from ranges of tensors.
* Added documentation string for the "explicit" padding scheme.
* More fixes.
* Added 'zerosLike' and 'onesLike' tensor initializers.
* Added a new 'stacking' tensor initializer and made some compatibility fixes.
* Added a new 'tiling' tensor initializer.
* Minor edit.
* Made some refactoring.
* Bug fix.
* Added support for the split op and its VJP.
* Added VJPs for stacking and tiling.
* Added VJP for concatenating.
* Added the gathering VJP.
* Bug fixes.
* Added an 'Optimizable' protocol.
* Moved some more activation functions from the stdlib.
* Added log-softmax VJP.
* Minor bug fix.
* Brought some initializers from stdlib.
* Brought some more stuff from the stdlib.
* Minor edit.
* Moved some more stuff to swift-apis.
* Removed all the newly-added ops.
* Moved some more stuff to swift-apis.
* Moved some more stuff to swift-apis.
* Added a README file to the 'Operators' source directory.
* Brought the gradient helper functions from the stdlib.
* Bug fixes.
* Brought the tensor tests from the stdlib.
* Minor bug fix.
* Addressed Richard's comments.
* Minor edit.
* Reverted the change in the existing optimizer implementations.
* Added VJPs for some operations.
* Incorporated fix from stdlib.
* Addressed Richard's feedback.
* Changed the indentation in the 'PythonConversion.swift' file.
* Changed the indentation in the 'Random.swift' file.
* Minor edit.
* Tabs to spaces.
Revert "Restructured the operators source files and added support for multiple new operators. (tensorflow#70)" (tensorflow#108)

This reverts commit d2c78f4.
This PR does some restructuring of the files where ops are defined and also adds the following:

* `Tensor(stacking:alongAxis:)`, `Tensor(concatenating:alongAxis:)`, and `Tensor(tiling:multiples:)`, along with their VJPs. Also adds `Tensor(zerosLike:)`, `Tensor(onesLike:)`, and `Tensor(rangeFrom:to:stride:)` with tensor-valued arguments.
* `Tensor.unstack()`, `Tensor.split(numSplits:alongAxis:)`, `Tensor.split(sizes:alongAxis:)`, `Tensor.gathering(atIndices:alongAxis:)`, `sigmoid(_:)`, `logSigmoid(_:)`, and `softplus(_:)`, along with their VJPs. Also adds `Tensor.batchGathering(atIndices:alongAxis:numBatchDims:)`, `Tensor.gathering(where:alongAxis:)`, and `Tensor.nonZeroIndices()`.
* An `Optimizable` protocol that makes the optimizers more generally applicable, other than just for layers.

Current issues:

* A `precondition` prevents me from making some of the new functions differentiable.

This is a work-in-progress PR that also depends on the advanced indexing PR, but I want to start a discussion and gather feedback about the source files restructuring and also some missing op support. I have the following main questions:

* Is the restructuring of the operators source files acceptable? I believe it will be hard to have all operators in a single source file.
* I currently cannot add support for gradients of `stacked` and `concatenated` with multiple tensors, because `#tfop` does not support ops that output lists of tensors. Are there any updates on this, or any current workarounds to return arrays of tensors?
* Is it OK to merge PRs that add support for new ops but not their VJPs yet? I plan to add those incrementally.

This PR so far adds support for the `stacked`, `concatenated`, `gathered`, `batchGathered`, `masked`, and `nonZeroIndices` ops, as well as for a new tensor initializer from ranges defined over tensors, and new `stacking`, `concatenating`, and `tiling` tensor initializers.
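To illustrate how a few of the listed APIs fit together, here is a hedged usage sketch; the signatures are taken from the description above, and the shapes in the comments reflect the intended semantics rather than verified output:

```swift
import TensorFlow

let a = Tensor<Float>([1, 2, 3])
let b = Tensor<Float>([4, 5, 6])

let stacked = Tensor(stacking: [a, b], alongAxis: 0)      // shape [2, 3]
let joined = Tensor(concatenating: [a, b], alongAxis: 0)  // shape [6]
let zeros = Tensor(zerosLike: stacked)                    // zeros with shape [2, 3]
let rows = stacked.unstack()                              // [a, b]
let halves = joined.split(numSplits: 2, alongAxis: 0)     // two tensors of shape [3]
```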