[trove][all][tc] A proposal to rearchitect Trove

[trove][all][tc] A proposal to rearchitect Trove

Amrith Kumar-2
Trove has evolved rapidly over the past several years, since integration in Icehouse when it only supported single instances of a few databases. Today it supports a dozen databases, including clusters and replication.

The user survey [1] indicates that while there is strong interest in the project, the development team knows of few large production deployments.

Recent changes in the OpenStack community at large (company realignments, acquisitions, layoffs) and the Trove community in particular, coupled with a mounting burden of technical debt, have prompted me to make this proposal to re-architect Trove.

This email summarizes several of the issues that face the project, both structurally and architecturally. This email does not claim to include a detailed specification for what the new Trove would look like, merely the recommendation that the community should come together and develop one so that the project can be sustainable and useful to those who wish to use it in the future.

TL;DR

Trove, with support for a dozen or so databases today, finds itself in a bind because there are few developers, and a code-base with a significant amount of technical debt.

Some architectural choices which the team made over the years have consequences which make the project less than ideal for deployers.

Given that there are no major production deployments of Trove at present, this provides us an opportunity to reset the project, learn from our v1 and come up with a strong v2.

An important aspect of making this proposal work is that we seek to eliminate the effort (planning, and coding) involved in migrating existing Trove v1 deployments to the proposed Trove v2. Effectively, with work beginning on Trove v2 as proposed here, Trove v1 as released with Pike will be marked as deprecated and users will have to migrate to Trove v2 when it becomes available.

While I would very much like to continue to support the users on Trove v1 through this transition, the simple fact is that absent community participation this will be impossible. Furthermore, given that there are no production deployments of Trove at this time, it seems pointless to build that upgrade path from Trove v1 to Trove v2; it would be the proverbial bridge from nowhere.

This (previous) statement is, I realize, contentious. There are those who have told me that an upgrade path must be provided, and there are those who have told me of unnamed deployments of Trove that would suffer. To this, all I can say is that if an upgrade path is of value to you, then please commit the development resources to participate in the community to make that possible. But equally, preventing a v2 of Trove or delaying it will only make the v1 that we have today less valuable.

We have learned a lot from v1, and the hope is that we can address that in v2. Some of the more significant things that I have learned are:

- We should adopt a versioned front-end API from the very beginning; making the REST API versioned is not a ‘v2 feature’

- A guest agent running on a tenant instance, with connectivity to a shared management message bus is a security loophole; encrypting traffic, per-tenant-passwords, and any other scheme is merely lipstick on a security hole

- Reliance on Nova for compute resources is fine, but dependence on Nova VM specific capabilities (like instance rebuild) is not; it makes things like containers or bare-metal second class citizens

- A fair portion of what Trove does is resource orchestration; don’t reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as far along when Trove got started but that’s not the case today and we have an opportunity to fix that now

- A similarly significant portion of what Trove does is to implement a state-machine that will perform specific workflows involved in implementing database specific operations. This makes the Trove taskmanager a stateful entity. Some of the operations could take a fair amount of time. This is a serious architectural flaw

- Tenants should not ever be able to directly interact with the underlying storage and compute used by database instances; that should be the default configuration, not an untested deployment alternative

- The CI should test all databases that are considered to be ‘supported’ without excessive use of resources in the gate; better code modularization will help determine the tests which can safely be skipped in testing changes

- Clusters should be first class citizens, not an afterthought; single instance databases may be the ‘special case’, not the other way around

- The project must provide guest images (or at least complete tooling for deployers to build these); while the project can’t distribute operating systems and database software, the current deployment model merely impedes adoption

- Clusters spanning OpenStack deployments are a real thing that must be supported

This might sound harsh; that isn’t the intent. Each of these is the consequence of one or more perfectly rational decisions. Some of those decisions have had unintended consequences, and others were made knowing that we would be incurring some technical debt; debt we have not had the time or resources to address. Fixing all these is not impossible; it just takes the dedication of resources by the community.

I do not have a complete design for what the new Trove would look like. For example, I don’t know how we will interact with other projects (like Heat). Many questions remain to be explored and answered.

Would it suffice to just use the existing Heat resources and build templates around those, or will it be better to implement custom Trove resources and then orchestrate things based on those resources?

Would Trove implement the workflows required for multi-stage database operations by itself, or would it rely on some other project (say Mistral) for this? Is Mistral really a workflow service, or just cron on steroids? I don’t know the answer but I would like to find out.

While we don’t have the answers to these questions, I think this is a conversation that we must have, one that we must decide on, and then as a community commit the resources required to make a Trove v2 which delivers on the mission of the project: “To provide scalable and reliable Cloud Database as a Service provisioning functionality for both relational and non-relational database engines, and to continue to improve its fully-featured and extensible open source framework.”[2]

Thanks,

-amrith


[1] https://www.openstack.org/assets/survey/April2017SurveyReport.pdf
[2] https://wiki.openstack.org/wiki/Trove#Mission_Statement



__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [hidden email]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [trove][all][tc] A proposal to rearchitect Trove

Thierry Carrez
Amrith Kumar wrote:

> [...]
> An important aspect of making this proposal work is that we seek to
> eliminate the effort (planning, and coding) involved in migrating
> existing Trove v1 deployments to the proposed Trove v2. Effectively,
> with work beginning on Trove v2 as proposed here, Trove v1 as released
> with Pike will be marked as deprecated and users will have to migrate to
> Trove v2 when it becomes available.
>
> While I would very much like to continue to support the users on Trove
> v1 through this transition, the simple fact is that absent community
> participation this will be impossible. Furthermore, given that there are
> no production deployments of Trove at this time, it seems pointless to
> build that upgrade path from Trove v1 to Trove v2; it would be the
> proverbial bridge from nowhere.
> [...]
From an OpenStack project naming perspective, IMHO the line between a
"v2" and a completely new project (with a new name) is whether you
provide an upgrade path. I feel like if you won't support v1 users at
all (and I understand the reasons why you wouldn't), the new project
should not be called "Trove v2", but "Hoard". I don't really want to set
a precedent of breaking users by restarting from scratch and calling it
"v2", while everywhere else we encourage projects to never break their
users.

In all cases, providing offline tooling to migrate your Trove resources
to Hoard equivalents would be a nice plus, but I'd say that this tooling
is likely to appear if there is a need. Just be receptive to the idea of
adding that in a tools/ directory :)

--
Thierry Carrez (ttx)


Re: [trove][all][tc] A proposal to rearchitect Trove

Fox, Kevin M
In reply to this post by Amrith Kumar-2
Thanks for starting this difficult discussion.

I think I agree with all the lessons learned except the Nova one. While you can treat containers and VMs the same, after years of using both, I really don't think it's a good idea to treat them equally. Containers can't work properly if used as a VM. (really, really.)

I agree wholeheartedly with your statement that it's mostly an orchestration problem and should reuse stuff now that there are options.

I would propose the following that I think meets your goals and could widen your contributor base substantially:

Look at the Kubernetes (k8s) concept of Operator -> https://coreos.com/blog/introducing-operators.html

They allow application-specific logic to be added to Kubernetes while reusing the rest of k8s to do what it's good at: container orchestration. etcd is just a clustered database, and if the operator concept works for it, it should also work for other databases such as Galera.

The point where I think containers and VMs are incompatible is also the thing I think will make Trove's life easier. You can think of a member of the database as a few different components, such as:
 * main database process
 * metrics gatherer (such as https://github.com/prometheus/mysqld_exporter)
 * trove_guest_agent

With the current approach, all are mixed into the same VM image, making it very difficult to update the trove_guest_agent (needed when you upgrade the Trove controllers) without touching the main database process. With the k8s sidecar concept, each would be a separate container loaded into the same pod.

So rather than needing to maintain a Trove image for every possible combination of db version, Trove version, etc., you can reuse upstream database containers along with Trove-provided guest agents.
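A minimal sketch of that sidecar layout, written as a Python dict in the shape of a Kubernetes Pod manifest. The container names and the `example/trove-guest-agent` image are hypothetical (no such image is named in this thread), and the container is called `guest-agent` rather than `trove_guest_agent` because Kubernetes container names cannot contain underscores. It only illustrates that the agent image can change while the database image stays the same.

```python
import json


def build_db_pod(name, db_image, exporter_image, agent_image):
    """Return a Pod manifest (plain dict) with one container per component."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                # Upstream database image, reused unmodified.
                {"name": "database", "image": db_image},
                # Metrics gatherer running as a sidecar.
                {"name": "metrics", "image": exporter_image},
                # Hypothetical Trove guest agent sidecar.
                {"name": "guest-agent", "image": agent_image},
            ]
        },
    }


def with_new_agent(pod, agent_image):
    """Return a copy of the manifest in which only the agent image differs."""
    updated = json.loads(json.dumps(pod))  # deep copy of plain JSON data
    for container in updated["spec"]["containers"]:
        if container["name"] == "guest-agent":
            container["image"] = agent_image
    return updated


pod = build_db_pod(
    "mysql-0",
    "mysql:5.7",
    "prom/mysqld-exporter",
    "example/trove-guest-agent:1.0",  # hypothetical image name
)
upgraded = with_new_agent(pod, "example/trove-guest-agent:1.1")
print(json.dumps(upgraded, indent=2))
```

Under these assumptions, an agent upgrade is a change to one container entry, and the database container definition is left untouched.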

There's a secure channel between kube-apiserver and kubelet so you can reuse it for secure communications. No need to add anything for secure communication. trove engine -> kubectl exec xxxxx-db -c guest_agent some command.
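As a rough sketch, the call above could be assembled like this. The helper name `guest_agent_exec_argv` and the `guest-agent` container name are assumptions made for illustration; `xxxxx-db` is the placeholder pod name from the message. The helper only builds the argument list and does not run anything.

```python
import shlex


def guest_agent_exec_argv(pod, command, container="guest-agent", namespace=None):
    """Build the argv for running a command in the agent container of a pod."""
    argv = ["kubectl"]
    if namespace:
        argv += ["--namespace", namespace]
    # "--" separates kubectl's own flags from the command run in the container.
    argv += ["exec", pod, "-c", container, "--"]
    argv += list(command)
    return argv


argv = guest_agent_exec_argv("xxxxx-db", ["some", "command"])
print(" ".join(shlex.quote(part) for part in argv))
```

Passing the result as a list to something like `subprocess.run(argv)` would avoid shell quoting problems, since each argument stays a separate element.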

There is k8s federation, so if the operator was started at the federation level, it can cross multiple OpenStack regions.

Another big feature that hasn't been mentioned yet, and that I think is critical, is performance. In our performance tests, databases in VMs have never performed particularly well. Using k8s as a base, bare metal nodes could be pulled in easily, with dedicated disks or SSDs that the pods land on that are very very close to the database. This should give native performance.

So, my suggestion would be to strongly consider basing Trove v2 on Kubernetes. It can provide a huge bang for the buck, simplifying the Trove architecture substantially while gaining the new features you list as being important. The Trove v2 OpenStack API can be exposed as a very thin wrapper over k8s Third Party Resources (TPR) and would make Trove entirely stateless. k8s maintains all state for everything in etcd.

Please consider this architecture.

Thanks,
Kevin



Re: [trove][all][tc] A proposal to rearchitect Trove

Matt Fischer
In reply to this post by Amrith Kumar-2
Amrith, 

Some good thoughts in your email. I've replied to a few specific pieces below. Overall I think it's a good start to a plan.

On Sun, Jun 18, 2017 at 5:35 AM, Amrith Kumar <[hidden email]> wrote:
> [...]
> We have learned a lot from v1, and the hope is that we can address that in v2. Some of the more significant things that I have learned are:
>
> - We should adopt a versioned front-end API from the very beginning; making the REST API versioned is not a ‘v2 feature’
>
> - A guest agent running on a tenant instance, with connectivity to a shared management message bus is a security loophole; encrypting traffic, per-tenant-passwords, and any other scheme is merely lipstick on a security hole

This was a major concern when we deployed it and drove the architectural decisions. I'd be glad to see it resolved or re-architected.

> - Reliance on Nova for compute resources is fine, but dependence on Nova VM specific capabilities (like instance rebuild) is not; it makes things like containers or bare-metal second class citizens
>
> - A fair portion of what Trove does is resource orchestration; don’t reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as far along when Trove got started but that’s not the case today and we have an opportunity to fix that now

+1

> - A similarly significant portion of what Trove does is to implement a state-machine that will perform specific workflows involved in implementing database specific operations. This makes the Trove taskmanager a stateful entity. Some of the operations could take a fair amount of time. This is a serious architectural flaw
>
> - Tenants should not ever be able to directly interact with the underlying storage and compute used by database instances; that should be the default configuration, not an untested deployment alternative

+1 to this also. Trove should offer a black box DB as a Service, not something the user sees as an instance+storage that they feel they can manipulate.

> - The CI should test all databases that are considered to be ‘supported’ without excessive use of resources in the gate; better code modularization will help determine the tests which can safely be skipped in testing changes

I would add that focusing on a few very well supported DBs, rather than adding more and more of them, would help in your Trove reboot.

> - Clusters should be first class citizens not an afterthought, single instance databases may be the ‘special case’, not the other way around

This is how we positioned Trove when it was deployed. A single node DB is not a very cloudy solution when you have to do maintenance or lose a hypervisor. I'd consider clusters the main use case. We discouraged anyone from using non-clustered solutions except for trying out Trove.

> - The project must provide guest images (or at least complete tooling for deployers to build these); while the project can’t distribute operating systems and database software, the current deployment model merely impedes adoption
>
> - Clusters spanning OpenStack deployments are a real thing that must be supported

or regions

 


Re: [trove][all][tc] A proposal to rearchitect Trove

Curtis
In reply to this post by Amrith Kumar-2
On Sun, Jun 18, 2017 at 5:35 AM, Amrith Kumar <[hidden email]> wrote:

> Trove has evolved rapidly over the past several years, since integration in
> IceHouse when it only supported single instances of a few databases. Today
> it supports a dozen databases including clusters and replication.
>
> The user survey [1] indicates that while there is strong interest in the
> project, there are few large production deployments that are known of (by
> the development team).
>
> Recent changes in the OpenStack community at large (company realignments,
> acquisitions, layoffs) and the Trove community in particular, coupled with a
> mounting burden of technical debt have prompted me to make this proposal to
> re-architect Trove.
>
> This email summarizes several of the issues that face the project, both
> structurally and architecturally. This email does not claim to include a
> detailed specification for what the new Trove would look like, merely the
> recommendation that the community should come together and develop one so
> that the project can be sustainable and useful to those who wish to use it
> in the future.
>
> TL;DR
>
> Trove, with support for a dozen or so databases today, finds itself in a
> bind because there are few developers, and a code-base with a significant
> amount of technical debt.
>
> Some architectural choices which the team made over the years have
> consequences which make the project less than ideal for deployers.
>
> Given that there are no major production deployments of Trove at present,
> this provides us an opportunity to reset the project, learn from our v1 and
> come up with a strong v2.
>
> An important aspect of making this proposal work is that we seek to
> eliminate the effort (planning, and coding) involved in migrating existing
> Trove v1 deployments to the proposed Trove v2. Effectively, with work
> beginning on Trove v2 as proposed here, Trove v1 as released with Pike will
> be marked as deprecated and users will have to migrate to Trove v2 when it
> becomes available.
>
> While I would very much like to continue to support the users on Trove v1
> through this transition, the simple fact is that absent community
> participation this will be impossible. Furthermore, given that there are no
> production deployments of Trove at this time, it seems pointless to build
> that upgrade path from Trove v1 to Trove v2; it would be the proverbial
> bridge from nowhere.
>
> This (previous) statement is, I realize, contentious. There are those who
> have told me that an upgrade path must be provided, and there are those who
> have told me of unnamed deployments of Trove that would suffer. To this, all
> I can say is that if an upgrade path is of value to you, then please commit
> the development resources to participate in the community to make that
> possible. But equally, preventing a v2 of Trove or delaying it will only
> make the v1 that we have today less valuable.
>
> We have learned a lot from v1, and the hope is that we can address that in
> v2. Some of the more significant things that I have learned are:
>
> - We should adopt a versioned front-end API from the very beginning; making
> the REST API versioned is not a ‘v2 feature’
>
> - A guest agent running on a tenant instance, with connectivity to a shared
> management message bus is a security loophole; encrypting traffic,
> per-tenant-passwords, and any other scheme is merely lipstick on a security
> hole
>
> - Reliance on Nova for compute resources is fine, but dependence on Nova VM
> specific capabilities (like instance rebuild) is not; it makes things like
> containers or bare-metal second class citizens
>
> - A fair portion of what Trove does is resource orchestration; don’t
> reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as far
> along when Trove got started but that’s not the case today and we have an
> opportunity to fix that now
>
> - A similarly significant portion of what Trove does is to implement a
> state-machine that will perform specific workflows involved in implementing
> database specific operations. This makes the Trove taskmanager a stateful
> entity. Some of the operations could take a fair amount of time. This is a
> serious architectural flaw
>
> - Tenants should not ever be able to directly interact with the underlying
> storage and compute used by database instances; that should be the default
> configuration, not an untested deployment alternative
>

As an operator I wouldn't run Trove as it is, unless I absolutely had to.

I think it is a good idea to reboot the project. I really think the
concept of "service VMs" should be a thing. I'm not sure where the
OpenStack community has landed on that, my fault for not paying close
attention, but we should be able to create VMs for a tenant that are
not managed by the tenant but that could be billed to them in some
fashion. At least that's my opinion.

> - The CI should test all databases that are considered to be ‘supported’
> without excessive use of resources in the gate; better code modularization
> will help determine the tests which can safely be skipped in testing changes
>
> - Clusters should be first class citizens not an afterthought, single
> instance databases may be the ‘special case’, not the other way around

Definitely agree on that. Cluster first model.

>
> - The project must provide guest images (or at least complete tooling for
> deployers to build these); while the project can’t distribute operating
> systems and database software, the current deployment model merely impedes
> adoption
>
> - Clusters spanning OpenStack deployments are a real thing that must be
> supported
>

I'm curious as to how this will be done. This is a requirement in
NFV-land as well for other services. Would be very powerful and is
needed in other areas.

Thanks,
Curtis.

> This might sound harsh, that isn’t the intent. Each of these is the
> consequence of one or more perfectly rational decisions. Some of those
> decisions have had unintended consequences, and others were made knowing
> that we would be incurring some technical debt; debt we have not had the
> time or resources to address. Fixing all these is not impossible, it just
> takes the dedication of resources by the community.
>
> I do not have a complete design for what the new Trove would look like. For
> example, I don’t know how we will interact with other projects (like Heat).
> Many questions remain to be explored and answered.
>
> Would it suffice to just use the existing Heat resources and build templates
> around those, or will it be better to implement custom Trove resources and
> then orchestrate things based on those resources?
>
> Would Trove implement the workflows required for multi-stage database
> operations by itself, or would it rely on some other project (say Mistral)
> for this? Is Mistral really a workflow service, or just cron on steroids? I
> don’t know the answer but I would like to find out.
>
> While we don’t have the answers to these questions, I think this is a
> conversation that we must have, one that we must decide on, and then as a
> community commit the resources required to make a Trove v2 which delivers on
> the mission of the project; “To provide scalable and reliable Cloud Database
> as a Service provisioning functionality for both relational and
> non-relational database engines, and to continue to improve its
> fully-featured and extensible open source framework.”[2]
>
> Thanks,
>
> -amrith
>
>
> [1] https://www.openstack.org/assets/survey/April2017SurveyReport.pdf
> [2] https://wiki.openstack.org/wiki/Trove#Mission_Statement
>
>
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: [hidden email]?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



--
Blog: serverascode.com

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [hidden email]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [trove][all][tc] A proposal to rearchitect Trove

Fei Long Wang-2


On 20/06/17 12:56, Curtis wrote:

> As an operator I wouldn't run Trove as it is, unless I absolutely had to.
>
> I think it is a good idea to reboot the project. I really think the
> concept of "service VMs" should be a thing. I'm not sure where the
> OpenStack community has landed on that, my fault for not paying close
> attention, but we should be able to create VMs for a tenant that are
> not managed by the tenant but that could be billed to them in some
> fashion. At least that's my opinion.

Re 'service VMs': yep, that could be very useful. In Zaqar we're working
on a spec to support a 'service queue', similar to 'service VMs', so that
the service user can create queues in the user's tenant. I can imagine
Trove benefiting from that feature as well.


--
Cheers & Best regards,
Feilong Wang (王飞龙)
--------------------------------------------------------------------------
Senior Cloud Software Engineer
Tel: +64-48032246
Email: [hidden email]
Catalyst IT Limited
Level 6, Catalyst House, 150 Willis Street, Wellington
--------------------------------------------------------------------------




Re: [trove][all][tc] A proposal to rearchitect Trove

Zane Bitter
In reply to this post by Curtis
On 19/06/17 20:56, Curtis wrote:
> I really think the
> concept of "service VMs" should be a thing. I'm not sure where the
> OpenStack community has landed on that, my fault for not paying close
> attention, but we should be able to create VMs for a tenant that are
> not managed by the tenant but that could be billed to them in some
> fashion. At least that's my opinion.

https://review.openstack.org/#/c/438134/


Re: [trove][all][tc] A proposal to rearchitect Trove

Doug Hellmann-2
In reply to this post by Curtis
Excerpts from Curtis's message of 2017-06-19 18:56:25 -0600:

> As an operator I wouldn't run Trove as it is, unless I absolutely had to.
>
> I think it is a good idea to reboot the project. I really think the
> concept of "service VMs" should be a thing. I'm not sure where the
> OpenStack community has landed on that, my fault for not paying close
> attention, but we should be able to create VMs for a tenant that are
> not managed by the tenant but that could be billed to them in some
> fashion. At least that's my opinion.

Does "service VM" need to be a first-class thing?  Akanda creates
them, using a service user. The VMs are tied to a "router" which
is the billable resource that the user understands and interacts with
through the API.

Doug


Re: [trove][all][tc] A proposal to rearchitect Trove

Jay Pipes
On 06/20/2017 09:42 AM, Doug Hellmann wrote:
> Does "service VM" need to be a first-class thing?  Akanda creates
> them, using a service user. The VMs are tied to a "router" which
> is the billable resource that the user understands and interacts with
> through the API.

Frankly, I believe all of these types of services should be built as
applications that run on OpenStack (or other) infrastructure. In other
words, they should not be part of the infrastructure itself.

There's really no need for a user of a DBaaS to have access to the host
or hosts the DB is running on. If the user really wanted that, they
would just spin up a VM/baremetal server and install the thing themselves.

Best,
-jay


Re: [trove][all][tc] A proposal to rearchitect Trove

Zane Bitter
In reply to this post by Amrith Kumar-2
On 18/06/17 07:35, Amrith Kumar wrote:

> Trove has evolved rapidly over the past several years, since integration
> in IceHouse when it only supported single instances of a few databases.
> Today it supports a dozen databases including clusters and replication.
>
> The user survey [1] indicates that while there is strong interest in the
> project, there are few large production deployments that are known of
> (by the development team).
>
> Recent changes in the OpenStack community at large (company
> realignments, acquisitions, layoffs) and the Trove community in
> particular, coupled with a mounting burden of technical debt have
> prompted me to make this proposal to re-architect Trove.
>
> This email summarizes several of the issues that face the project, both
> structurally and architecturally. This email does not claim to include a
> detailed specification for what the new Trove would look like, merely
> the recommendation that the community should come together and develop
> one so that the project can be sustainable and useful to those who wish
> to use it in the future.
>
> TL;DR
>
> Trove, with support for a dozen or so databases today, finds itself in a
> bind because there are few developers, and a code-base with a
> significant amount of technical debt.
>
> Some architectural choices which the team made over the years have
> consequences which make the project less than ideal for deployers.
>
> Given that there are no major production deployments of Trove at
> present, this provides us an opportunity to reset the project, learn
> from our v1 and come up with a strong v2.
>
> An important aspect of making this proposal work is that we seek to
> eliminate the effort (planning, and coding) involved in migrating
> existing Trove v1 deployments to the proposed Trove v2. Effectively,
> with work beginning on Trove v2 as proposed here, Trove v1 as released
> with Pike will be marked as deprecated and users will have to migrate to
> Trove v2 when it becomes available.

I'm personally fine with not having a migration path (because I'm not
personally running Trove v1 ;) although Thierry's point about choosing a
different name is valid and surely something the TC will want to weigh
in on.

However, I am always concerned about throwing out working code and
rewriting from scratch. I'd be more comfortable if I saw some value
being salvaged from the existing Trove project, other than as just an
extended PoC/learning exercise. Would the API be similar to the current
Trove one? Can at least some tests be salvaged to rapidly increase
confidence that the new code works as expected?

> While I would very much like to continue to support the users on Trove
> v1 through this transition, the simple fact is that absent community
> participation this will be impossible. Furthermore, given that there are
> no production deployments of Trove at this time, it seems pointless to
> build that upgrade path from Trove v1 to Trove v2; it would be the
> proverbial bridge from nowhere.
>
> This (previous) statement is, I realize, contentious. There are those
> who have told me that an upgrade path must be provided, and there are
> those who have told me of unnamed deployments of Trove that would
> suffer. To this, all I can say is that if an upgrade path is of value to
> you, then please commit the development resources to participate in the
> community to make that possible. But equally, preventing a v2 of Trove
> or delaying it will only make the v1 that we have today less valuable.
>
> We have learned a lot from v1, and the hope is that we can address that
> in v2. Some of the more significant things that I have learned are:
>
> - We should adopt a versioned front-end API from the very beginning;
> making the REST API versioned is not a ‘v2 feature’
>
> - A guest agent running on a tenant instance, with connectivity to a
> shared management message bus is a security loophole; encrypting
> traffic, per-tenant-passwords, and any other scheme is merely lipstick
> on a security hole

Totally agree here, any component of the architecture that is accessed
directly by multiple tenants needs to be natively multi-tenant. I
believe this has been one of the barriers to adoption.

> - Reliance on Nova for compute resources is fine, but dependence on Nova
> VM specific capabilities (like instance rebuild) is not; it makes things
> like containers or bare-metal second class citizens
>
> - A fair portion of what Trove does is resource orchestration; don’t
> reinvent the wheel, there’s Heat for that. Admittedly, Heat wasn’t as
> far along when Trove got started but that’s not the case today and we
> have an opportunity to fix that now

+1, obviously ;)

Although I also think Kevin's suggestion is worthy of serious consideration.

> - A similarly significant portion of what Trove does is to implement a
> state-machine that will perform specific workflows involved in
> implementing database specific operations. This makes the Trove
> taskmanager a stateful entity. Some of the operations could take a fair
> amount of time. This is a serious architectural flaw
>
> - Tenants should not ever be able to directly interact with the
> underlying storage and compute used by database instances; that should
> be the default configuration, not an untested deployment alternative
>
> - The CI should test all databases that are considered to be ‘supported’
> without excessive use of resources in the gate; better code
> modularization will help determine the tests which can safely be skipped
> in testing changes
>
> - Clusters should be first class citizens not an afterthought, single
> instance databases may be the ‘special case’, not the other way around
>
> - The project must provide guest images (or at least complete tooling
> for deployers to build these); while the project can’t distribute
> operating systems and database software, the current deployment model
> merely impedes adoption
>
> - Clusters spanning OpenStack deployments are a real thing that must be
> supported
>
> This might sound harsh, that isn’t the intent. Each of these is the
> consequence of one or more perfectly rational decisions. Some of those
> decisions have had unintended consequences, and others were made knowing
> that we would be incurring some technical debt; debt we have not had the
> time or resources to address. Fixing all these is not impossible, it
> just takes the dedication of resources by the community.
>
> I do not have a complete design for what the new Trove would look like.
> For example, I don’t know how we will interact with other projects (like
> Heat). Many questions remain to be explored and answered.
>
> Would it suffice to just use the existing Heat resources and build
> templates around those, or will it be better to implement custom Trove
> resources and then orchestrate things based on those resources?

(Context: Amrith and I discussed this already)

The idea here is that there are some things that the Heat 'workflow'
doesn't handle by itself - for example, quiescing a server prior to
rebuilding (as opposed to replacing) it. The most obvious way to do that
(discussed in Amrith's next paragraph) is to drive it from some workflow
outside of Heat, with a Heat stack update to rebuild the server as one
of the steps. However, an alternative might be to implement custom Heat
resources that codify the required workflow.

IMHO this doesn't really improve the problem described above ("This
makes the Trove taskmanager a stateful entity. Some of the operations
could take a fair amount of time. This is a serious architectural flaw")
so much as move it around - Heat persists state at the resource level,
but isn't really well set up to handle a lot of state within a resource.
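
To make the "move it around" point concrete, here is a rough sketch (plain
Python, not Heat or Trove code) of a multi-step operation whose progress is
recorded after every step, so that whichever worker picks it up next can
resume it. The step names and the dict standing in for a database are
made up for illustration; wherever that record lives is where the state
ends up.

```python
# Hypothetical sketch: persist workflow progress so a worker holds no state.
STEPS = ["quiesce", "snapshot", "rebuild", "restore"]  # illustrative names


def run_steps(store, op_id, fail_at=None):
    """Run the remaining steps for op_id, recording progress after each."""
    done = store.setdefault(op_id, [])
    for step in STEPS[len(done):]:
        if step == fail_at:
            return False  # simulate the worker dying before this step
        done.append(step)  # recorded progress survives a worker restart
    return True


store = {}  # stands in for a database table
first = run_steps(store, "op-1", fail_at="rebuild")  # first worker stops part way
progress_after_failure = list(store["op-1"])
resumed = run_steps(store, "op-1")  # a different worker picks it up
```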

> Would Trove implement the workflows required for multi-stage database
> operations by itself,

One option to look at here is the taskflow library that Josh and others
wrote. It works well for the case where the workflow can be hard-coded
in code (which I think may fit this use case). It's already used by
Cinder, and perhaps other projects.
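
As a rough illustration of what "hard-coded in code" means here, the
following stdlib-only sketch runs tasks in a fixed order and reverts the
completed ones in reverse if a later task fails. It only imitates the
general pattern; the class and function names are hypothetical and this is
not taskflow's actual API.

```python
# Stdlib-only sketch of a hard-coded linear workflow with rollback.
# This imitates the pattern; it is not taskflow's API.
class Task:
    name = "task"

    def execute(self, log):
        log.append(f"{self.name}:execute")

    def revert(self, log):
        log.append(f"{self.name}:revert")


class Quiesce(Task):
    name = "quiesce"


class Rebuild(Task):
    name = "rebuild"

    def execute(self, log):
        raise RuntimeError("rebuild failed")  # simulated failure


def run_linear(tasks, log):
    """Execute tasks in order; on failure revert completed ones in reverse."""
    completed = []
    try:
        for t in tasks:
            t.execute(log)
            completed.append(t)
    except Exception:
        for t in reversed(completed):
            t.revert(log)
        return False
    return True


log = []
ok = run_linear([Quiesce(), Rebuild()], log)
```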

> or would it rely on some other project (say
> Mistral) for this? Is Mistral really a workflow service, or just cron on
> steroids? I don’t know the answer but I would like to find out.

Mistral really is a workflow service. It uses YAML rather than Python to
define workflows, so it's better than taskflow for the case where the
workflow needs to be generated at runtime. Obviously it also has the
advantage of a multi-tenant REST API, so it can provide a pluggability
point for users to customise. It's possible that neither of those
advantages is relevant in this situation.
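
For contrast with a workflow fixed in code, here is a small sketch of a
workflow that is built as data at runtime (its length depends on how many
cluster members there are) and then interpreted. The step shape, the action
names, and the handlers are all invented for the example; this is not
Mistral's YAML workflow language, just the general idea of a generated
workflow.

```python
# Sketch: a workflow built as data at runtime (made-up shape, not Mistral's DSL).
def build_workflow(members):
    """Return an ordered list of step dicts for restarting each cluster member."""
    steps = []
    for m in members:
        steps.append({"action": "drain", "target": m})
        steps.append({"action": "restart", "target": m})
    steps.append({"action": "verify_cluster", "target": None})
    return steps


def run(workflow, handlers):
    """Interpret the workflow by dispatching each step to its handler."""
    return [handlers[s["action"]](s["target"]) for s in workflow]


handlers = {
    "drain": lambda t: f"drained {t}",
    "restart": lambda t: f"restarted {t}",
    "verify_cluster": lambda t: "cluster ok",
}
wf = build_workflow(["db-1", "db-2"])
results = run(wf, handlers)
```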

One potential advantage of Mistral is that the workflows can be set up
as part of a Heat template. If all of the workflows were set up like
that, it would be easy for users to use the generated templates as a
private database management layer on a cloud that didn't offer it
as-a-Service.

The disadvantage, obviously, is that it requires the cloud to offer
Mistral as-a-Service, and currently not nearly as many clouds do that
as I'd like.

> While we don’t have the answers to these questions, I think this is a
> conversation that we must have, one that we must decide on, and then as
> a community commit the resources required to make a Trove v2 which
> delivers on the mission of the project; “To provide scalable and
> reliable Cloud Database as a Service provisioning functionality for both
> relational and non-relational database engines, and to continue to
> improve its fully-featured and extensible open source framework.”[2]

+1

cheers,
Zane.


Re: [trove][all][tc] A proposal to rearchitect Trove

Zane Bitter
In reply to this post by Jay Pipes
On 20/06/17 10:08, Jay Pipes wrote:

> On 06/20/2017 09:42 AM, Doug Hellmann wrote:
>> Does "service VM" need to be a first-class thing?  Akanda creates
>> them, using a service user. The VMs are tied to a "router" which
>> is the billable resource that the user understands and interacts with
>> through the API.
>
> Frankly, I believe all of these types of services should be built as
> applications that run on OpenStack (or other) infrastructure. In other
> words, they should not be part of the infrastructure itself.
>
> There's really no need for a user of a DBaaS to have access to the host
> or hosts the DB is running on. If the user really wanted that, they
> would just spin up a VM/baremetal server and install the thing themselves.

Hey Jay,
I'd be interested in exploring this idea with you, because I think
everyone agrees that this would be a good goal, but at least in my mind
it's not obvious what the technical solution should be. (Actually, I've
read your email a bunch of times now, and I go back and forth on which
one you're actually advocating for.) The two options, as I see it, are
as follows:

1) The database VMs are created in the user's tena^W project. They
connect directly to the tenant's networks, are governed by the user's
quota, and are billed to the project as Nova VMs (on top of whatever
additional billing might come along with the management services). A
[future] feature in Nova (https://review.openstack.org/#/c/438134/)
allows the Trove service to lock down access so that the user cannot
actually interact with the server using Nova, but must go through the
Trove API. On a cloud that doesn't include Trove, a user could run Trove
as an application themselves and all it would have to do differently is
not pass the service token to lock down the VM.

alternatively:

2) The database VMs are created in a project belonging to the operator
of the service. They're connected to the user's network through <magic>,
and isolated from other users' databases running in the same project
through <security groups? hierarchical projects? magic?>. Trove has its
own quota management and billing. The user cannot interact with the
server using Nova since it is owned by a different project. On a cloud
that doesn't include Trove, a user could run Trove as an application
themselves, by giving it credentials for their own project and disabling
all of the cross-tenant networking stuff.
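
To make the two options easier to compare, here is a minimal Python sketch of where each one puts the VM and who enforces the user-facing quota. It is only an illustration: `Model`, `Placement` and `plan_placement` are hypothetical names, not Trove or Nova APIs, and the <magic> networking in option (2) is reduced to a boolean flag.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Model(Enum):
    USER_PROJECT = 1      # option (1): VM lives in the user's project
    OPERATOR_PROJECT = 2  # option (2): VM lives in the operator's project


@dataclass(frozen=True)
class Placement:
    vm_project: str                 # project that owns the Nova server
    quota: str                      # user-facing quota: "nova" or "trove"
    service_token_lock: bool        # relies on the proposed Nova lock-down
    cross_project_networking: bool  # needs the admin-only network plumbing


def plan_placement(model: Model, user_project: str,
                   operator_project: Optional[str] = None) -> Placement:
    """Describe where a database VM would live under each option.

    operator_project=None stands for a cloud without Trove, where the user
    runs Trove themselves against their own project.
    """
    standalone = operator_project is None
    if model is Model.USER_PROJECT:
        # Standalone differs only in not passing the service token.
        return Placement(user_project, "nova", not standalone, False)
    if standalone:
        # Own credentials, cross-tenant networking disabled.
        return Placement(user_project, "trove", False, False)
    return Placement(operator_project, "trove", False, True)
```

Read this way, option (1) changes a single flag between the hosted and standalone cases, while option (2) changes both the owning project and the networking.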

Of course the current situation, as Amrith alluded to, is that the default
is option (1) except without the lock-down feature in Nova, while some
operators are deploying option (2) but it's not tested upstream...
clearly that's the worst of all possible worlds, and AIUI nobody
disagrees with that.

To my mind, (1) sounds more like "applications that run on OpenStack (or
other) infrastructure", since it doesn't require stuff like the
admin-only cross-project networking that makes it effectively "part of
the infrastructure itself" - as evidenced by the fact that unprivileged
users can run it standalone with little more than a simple auth
middleware change. But I suspect you are going to use similar logic to
argue for (2)? I'd be interested to hear your thoughts.

cheers,
Zane.


Re: [trove][all][tc] A proposal to rearchitect Trove

Jay Pipes
Good discussion, Zane. Comments inline.

On 06/20/2017 11:01 AM, Zane Bitter wrote:

> On 20/06/17 10:08, Jay Pipes wrote:
>> On 06/20/2017 09:42 AM, Doug Hellmann wrote:
>>> Does "service VM" need to be a first-class thing?  Akanda creates
>>> them, using a service user. The VMs are tied to a "router" which
>>> is the billable resource that the user understands and interacts with
>>> through the API.
>>
>> Frankly, I believe all of these types of services should be built as
>> applications that run on OpenStack (or other) infrastructure. In other
>> words, they should not be part of the infrastructure itself.
>>
>> There's really no need for a user of a DBaaS to have access to the
>> host or hosts the DB is running on. If the user really wanted that,
>> they would just spin up a VM/baremetal server and install the thing
>> themselves.
>
> Hey Jay,
> I'd be interested in exploring this idea with you, because I think
> everyone agrees that this would be a good goal, but at least in my mind
> it's not obvious what the technical solution should be. (Actually, I've
> read your email a bunch of times now, and I go back and forth on which
> one you're actually advocating for.) The two options, as I see it, are
> as follows:
>
> 1) The database VMs are created in the user's tena^W project. They
> connect directly to the tenant's networks, are governed by the user's
> quota, and are billed to the project as Nova VMs (on top of whatever
> additional billing might come along with the management services). A
> [future] feature in Nova (https://review.openstack.org/#/c/438134/)
> allows the Trove service to lock down access so that the user cannot
> actually interact with the server using Nova, but must go through the
> Trove API. On a cloud that doesn't include Trove, a user could run Trove
> as an application themselves and all it would have to do differently is
> not pass the service token to lock down the VM.
>
> alternatively:
>
> 2) The database VMs are created in a project belonging to the operator
> of the service. They're connected to the user's network through <magic>,
> and isolated from other users' databases running in the same project
> through <security groups? hierarchical projects? magic?>. Trove has its
> own quota management and billing. The user cannot interact with the
> server using Nova since it is owned by a different project. On a cloud
> that doesn't include Trove, a user could run Trove as an application
> themselves, by giving it credentials for their own project and disabling
> all of the cross-tenant networking stuff.

None of the above :)

Don't think about VMs at all. Or networking plumbing. Or volume storage
or any of that.

Think only in terms of what a user of a DBaaS really wants. At the end
of the day, all they want is an address in the cloud where they can
point their application to write and read data from.

Do they want that data connection to be fast and reliable? Of course,
but how that happens is irrelevant to them.

Do they want that data to be safe and backed up? Of course, but how that
happens is irrelevant to them.

The problem with many of these high-level *aaS projects is that they
consider their user to be a typical tenant of general cloud
infrastructure -- focused on launching VMs and creating volumes and
networks etc. And the discussions around the implementation of these
projects always come back to minutiae about how to set up secure
communication channels between a control plane message bus and the
service VMs.

If you create these projects as applications that run on cloud
infrastructure (OpenStack, k8s or otherwise), then the discussions focus
instead on how the real end-users -- the ones that actually call the
APIs and utilize the service -- would interact with the APIs and not the
underlying infrastructure itself.

Here's an example to think about...

What if a provider of this DBaaS service wanted to jam 100 database
instances on a single VM and provide connectivity to those database
instances to 100 different tenants?

Would those tenants know if those databases were all serviced from a
single database server process running on the VM? Or 100 containers each
running a separate database server process? Or 10 containers running 10
database server processes each?

No, of course not. And the tenant wouldn't care at all, because the
point of the DBaaS service is to get a database. It isn't to get one or
more VMs/containers/baremetal servers.
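
A small Python sketch of that thought experiment, under stated assumptions: `slot_for` and `endpoint_for` are made-up helpers, round-robin placement is just one possible policy, and the `.example.invalid` address is a placeholder. The point it tries to show is that the tenant-facing endpoint stays the same whichever packing the provider picks.

```python
def slot_for(db_index: int, containers: int, procs_per_container: int) -> tuple:
    """Place database number db_index in a (container, process) slot."""
    slots = containers * procs_per_container
    return divmod(db_index % slots, procs_per_container)


def endpoint_for(tenant: str) -> str:
    """The only thing the tenant sees: an address to point their app at."""
    return f"{tenant}.db.example.invalid"


# Three ways a provider could pack 100 databases onto one VM.
LAYOUTS = {
    "one shared server process": (1, 1),
    "100 containers, one process each": (100, 1),
    "10 containers, 10 processes each": (10, 10),
}

for name, (containers, procs) in LAYOUTS.items():
    container, proc = slot_for(37, containers, procs)
    print(f"{name}: database 37 -> container {container}, process {proc}; "
          f"tenant still connects to {endpoint_for('tenant37')}")
```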

At the end of the day, I think Trove is best implemented as a hosted
application that exposes an API to its users that is entirely separate
from the underlying infrastructure APIs like Cinder/Nova/Neutron.

This is similar to Kevin's k8s Operator idea, which I support but in a
generic fashion that isn't specific to k8s.

In the same way that k8s abstracts the underlying infrastructure (via
its "cloud provider" concept), I think that Trove and similar projects
need to use a similar abstraction and focus on providing a different API
to their users that doesn't leak the underlying infrastructure API
concepts out.
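
One way to picture that abstraction, as a rough Python sketch rather than a proposed design: the service talks to a pluggable provider, and the user-facing call returns only a database endpoint. All class and method names here (`InfrastructureProvider`, `FakeProvider`, `DatabaseService`, `provision`, `create_database`) are hypothetical and do not correspond to existing Trove or Kubernetes APIs.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


class InfrastructureProvider(ABC):
    """Hides whether a database runs on a VM, a container or bare metal."""

    @abstractmethod
    def provision(self, engine: str, name: str) -> Endpoint:
        ...


class FakeProvider(InfrastructureProvider):
    """Stand-in backend for illustration; a real one would call a cloud API."""

    PORTS = {"mysql": 3306, "postgresql": 5432}

    def provision(self, engine: str, name: str) -> Endpoint:
        return Endpoint(f"{name}.db.example.invalid", self.PORTS[engine])


class DatabaseService:
    """User-facing API: no servers, networks or volumes in the signature."""

    def __init__(self, provider: InfrastructureProvider):
        self._provider = provider

    def create_database(self, tenant: str, engine: str) -> Endpoint:
        return self._provider.provision(engine, f"{tenant}-{engine}")
```

Swapping `FakeProvider` for a Nova-backed or k8s-backed provider would not change what `create_database` returns to the caller, which is the property being argued for.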

Best,
-jay

> Of course the current situation, as Amrith alluded to, where the default
> is option (1) except without the lock-down feature in Nova, though some
> operators are deploying option (2) but it's not tested upstream...
> clearly that's the worst of all possible worlds, and AIUI nobody
> disagrees with that.
>
> To my mind, (1) sounds more like "applications that run on OpenStack (or
> other) infrastructure", since it doesn't require stuff like the
> admin-only cross-project networking that makes it effectively "part of
> the infrastructure itself" - as evidenced by the fact that unprivileged
> users can run it standalone with little more than a simple auth
> middleware change. But I suspect you are going to use similar logic to
> argue for (2)? I'd be interested to hear your thoughts.
>
> cheers,
> Zane.


Re: [trove][all][tc] A proposal to rearchitect Trove

Michael Bayer


On 06/20/2017 11:45 AM, Jay Pipes wrote:

> Good discussion, Zane. Comments inline.
>
> On 06/20/2017 11:01 AM, Zane Bitter wrote:
>>
>> 2) The database VMs are created in a project belonging to the operator
>> of the service. They're connected to the user's network through
>> <magic>, and isolated from other users' databases running in the same
>> project through <security groups? hierarchical projects? magic?>.
>> Trove has its own quota management and billing. The user cannot
>> interact with the server using Nova since it is owned by a different
>> project. On a cloud that doesn't include Trove, a user could run Trove
>> as an application themselves, by giving it credentials for their own
>> project and disabling all of the cross-tenant networking stuff.
>
> None of the above :)
>
> Don't think about VMs at all. Or networking plumbing. Or volume storage
> or any of that.
>
> Think only in terms of what a user of a DBaaS really wants. At the end
> of the day, all they want is an address in the cloud where they can
> point their application to write and read data from.
>
> Do they want that data connection to be fast and reliable? Of course,
> but how that happens is irrelevant to them.
>
> Do they want that data to be safe and backed up? Of course, but how that
> happens is irrelevant to them.

Hi, I'm just a newb trying to follow along... isn't that what #2 is
proposing?  It's just talking about the implementation a bit.

(Guess this comes down to the terms "user" and "operator" - e.g.
"operator" has the VMs w/ the DBs, "user" gets a login to a DB.  "user"
is the person who pushes the trove button to "give me a database")



>
> The problem with many of these high-level *aaS projects is that they
> consider their user to be a typical tenant of general cloud
> infrastructure -- focused on launching VMs and creating volumes and
> networks etc. And the discussions around the implementation of these
> projects always come back to minutiae about how to set up secure
> communication channels between a control plane message bus and the
> service VMs.
>
> If you create these projects as applications that run on cloud
> infrastructure (OpenStack, k8s or otherwise), then the discussions focus
> instead on how the real end-users -- the ones that actually call the
> APIs and utilize the service -- would interact with the APIs and not the
> underlying infrastructure itself.
>
> Here's an example to think about...
>
> What if a provider of this DBaaS service wanted to jam 100 database
> instances on a single VM and provide connectivity to those database
> instances to 100 different tenants?
>
> Would those tenants know if those databases were all serviced from a
> single database server process running on the VM? Or 100 containers each
> running a separate database server process? Or 10 containers running 10
> database server processes each?
>
> No, of course not. And the tenant wouldn't care at all, because the
> point of the DBaaS service is to get a database. It isn't to get one or
> more VMs/containers/baremetal servers.
>
> At the end of the day, I think Trove is best implemented as a hosted
> application that exposes an API to its users that is entirely separate
> from the underlying infrastructure APIs like Cinder/Nova/Neutron.
>
> This is similar to Kevin's k8s Operator idea, which I support but in a
> generic fashion that isn't specific to k8s.
>
> In the same way that k8s abstracts the underlying infrastructure (via
> its "cloud provider" concept), I think that Trove and similar projects
> need to use a similar abstraction and focus on providing a different API
> to their users that doesn't leak the underlying infrastructure API
> concepts out.
>
> Best,
> -jay
>
>> Of course the current situation, as Amrith alluded to, where the
>> default is option (1) except without the lock-down feature in Nova,
>> though some operators are deploying option (2) but it's not tested
>> upstream... clearly that's the worst of all possible worlds, and AIUI
>> nobody disagrees with that.
>>
>> To my mind, (1) sounds more like "applications that run on OpenStack
>> (or other) infrastructure", since it doesn't require stuff like the
>> admin-only cross-project networking that makes it effectively "part of
>> the infrastructure itself" - as evidenced by the fact that
>> unprivileged users can run it standalone with little more than a
>> simple auth middleware change. But I suspect you are going to use
>> similar logic to argue for (2)? I'd be interested to hear your thoughts.
>>
>> cheers,
>> Zane.


Re: [trove][all][tc] A proposal to rearchitect Trove

Clint Byrum
In reply to this post by Jay Pipes
Excerpts from Jay Pipes's message of 2017-06-20 10:08:54 -0400:

> On 06/20/2017 09:42 AM, Doug Hellmann wrote:
> > Does "service VM" need to be a first-class thing?  Akanda creates
> > them, using a service user. The VMs are tied to a "router" which
> > is the billable resource that the user understands and interacts with
> > through the API.
>
> Frankly, I believe all of these types of services should be built as
> applications that run on OpenStack (or other) infrastructure. In other
> words, they should not be part of the infrastructure itself.
>
> There's really no need for a user of a DBaaS to have access to the host
> or hosts the DB is running on. If the user really wanted that, they
> would just spin up a VM/baremetal server and install the thing themselves.
>

There's one reason, and that is specialized resources that we don't
trust to be multi-tenant.

Baremetal done multi-tenant is hard, just ask our friends who were/are
running OnMetal. But baremetal done for the purposes of running MySQL
clusters that only allow users to access MySQL and control everything
via an agent of sorts is a lot simpler. You can let them all share a
layer 2 with no MAC filtering for instance, since you are in control at
the OS level.


Re: [trove][all][tc] A proposal to rearchitect Trove

Zane Bitter
In reply to this post by Jay Pipes
On 20/06/17 11:45, Jay Pipes wrote:
> Good discussion, Zane. Comments inline.

++

> On 06/20/2017 11:01 AM, Zane Bitter wrote:
>> On 20/06/17 10:08, Jay Pipes wrote:
>>> On 06/20/2017 09:42 AM, Doug Hellmann wrote:
>>>> Does "service VM" need to be a first-class thing?  Akanda creates
>>>> them, using a service user. The VMs are tied to a "router" which
>>>> is the billable resource that the user understands and interacts with
>>>> through the API.
>>>
>>> Frankly, I believe all of these types of services should be built as
>>> applications that run on OpenStack (or other) infrastructure. In
>>> other words, they should not be part of the infrastructure itself.
>>>
>>> There's really no need for a user of a DBaaS to have access to the
>>> host or hosts the DB is running on. If the user really wanted that,
>>> they would just spin up a VM/baremetal server and install the thing
>>> themselves.
>>
>> Hey Jay,
>> I'd be interested in exploring this idea with you, because I think
>> everyone agrees that this would be a good goal, but at least in my
>> mind it's not obvious what the technical solution should be.
>> (Actually, I've read your email a bunch of times now, and I go back
>> and forth on which one you're actually advocating for.) The two
>> options, as I see it, are as follows:
>>
>> 1) The database VMs are created in the user's tena^W project. They
>> connect directly to the tenant's networks, are governed by the user's
>> quota, and are billed to the project as Nova VMs (on top of whatever
>> additional billing might come along with the management services). A
>> [future] feature in Nova (https://review.openstack.org/#/c/438134/)
>> allows the Trove service to lock down access so that the user cannot
>> actually interact with the server using Nova, but must go through the
>> Trove API. On a cloud that doesn't include Trove, a user could run
>> Trove as an application themselves and all it would have to do
>> differently is not pass the service token to lock down the VM.
>>
>> alternatively:
>>
>> 2) The database VMs are created in a project belonging to the operator
>> of the service. They're connected to the user's network through
>> <magic>, and isolated from other users' databases running in the same
>> project through <security groups? hierarchical projects? magic?>.
>> Trove has its own quota management and billing. The user cannot
>> interact with the server using Nova since it is owned by a different
>> project. On a cloud that doesn't include Trove, a user could run Trove
>> as an application themselves, by giving it credentials for their own
>> project and disabling all of the cross-tenant networking stuff.
>
> None of the above :)
>
> Don't think about VMs at all. Or networking plumbing. Or volume storage
> or any of that.

OK, but somebody has to ;)

> Think only in terms of what a user of a DBaaS really wants. At the end
> of the day, all they want is an address in the cloud where they can
> point their application to write and read data from.
>
> Do they want that data connection to be fast and reliable? Of course,
> but how that happens is irrelevant to them.
>
> Do they want that data to be safe and backed up? Of course, but how that
> happens is irrelevant to them.

Fair enough. The world has changed a lot since RDS (which was the model
for Trove) was designed, so it's certainly worth reviewing the base
assumptions before embarking on a new design.

> The problem with many of these high-level *aaS projects is that they
> consider their user to be a typical tenant of general cloud
> infrastructure -- focused on launching VMs and creating volumes and
> networks etc. And the discussions around the implementation of these
> projects always come back to minutiae about how to set up secure
> communication channels between a control plane message bus and the
> service VMs.

Incidentally, the reason that discussions always come back to that is
because OpenStack isn't very good at it, which is a huge problem not
only for the *aaS projects but for user applications in general running
on OpenStack.

If we had fine-grained authorisation and ubiquitous multi-tenant
asynchronous messaging in OpenStack then I firmly believe that we, and
application developers, would be in much better shape.

> If you create these projects as applications that run on cloud
> infrastructure (OpenStack, k8s or otherwise),

I'm convinced there's an interesting idea here, but the terminology
you're using doesn't really capture it. When you say 'as applications
that run on cloud infrastructure', it sounds like you mean they should
run in a Nova VM, or in a Kubernetes cluster somewhere, rather than on
the OpenStack control plane. I don't think that's what you mean though,
because you can (and IIUC Rackspace does) deploy OpenStack services that
way already, and it has no real effect on the architecture of those
services.

> then the discussions focus
> instead on how the real end-users -- the ones that actually call the
> APIs and utilize the service -- would interact with the APIs and not the
> underlying infrastructure itself.
>
> Here's an example to think about...
>
> What if a provider of this DBaaS service wanted to jam 100 database
> instances on a single VM and provide connectivity to those database
> instances to 100 different tenants?
>
> Would those tenants know if those databases were all serviced from a
> single database server process running on the VM?

You bet they would when one (or all) of the other 99 decided to run a
really expensive query at an inopportune moment :)

> Or 100 containers each
> running a separate database server process? Or 10 containers running 10
> database server processes each?
>
> No, of course not. And the tenant wouldn't care at all, because the

Well, if they had any kind of regulatory (or even performance)
requirements then the tenant might care really quite a lot. But I take
your point that many might not and it would be good to be able to offer
them lower cost options.

> point of the DBaaS service is to get a database. It isn't to get one or
> more VMs/containers/baremetal servers.

I'm not sure I entirely agree here. There are two kinds of DBaaS. One is
a data API: a multitenant database a la DynamoDB. Those are very cool,
and I'm excited about the potential to reduce the granularity of billing
to a minimum, in much the same way Swift does for storage, and I'm sad
that OpenStack's attempt in this space (MagnetoDB) didn't work out. But
Trove is not that.

People use Trove because they want to use a *particular* database, but
still have all the upgrades, backups, &c. handled for them. Given that
the choice of database is explicitly *not* abstracted away from them,
things like how many different VMs/containers/baremetal servers the
database is running on are very much relevant IMHO, because what you
want depends on both the database and how you're trying to use it. And
because (afaik) none of them have native multitenancy, it's necessary
that no tenant should have to share with any other.

Essentially Trove operates at a moderate level of abstraction -
somewhere between managing the database + the infrastructure it runs on
yourself and just an API endpoint you poke data into. It also operates
at the coarse end of a granularity spectrum running from
VMs->Containers->pay as you go.

It's reasonable to want to move closer to the middle of the granularity
spectrum. But you can't go all the way to the high abstraction/fine
grained ends of the spectra (which turn out to be equivalent) without
becoming something qualitatively different.

> At the end of the day, I think Trove is best implemented as a hosted
> application that exposes an API to its users that is entirely separate
> from the underlying infrastructure APIs like Cinder/Nova/Neutron.
>
> This is similar to Kevin's k8s Operator idea, which I support but in a
> generic fashion that isn't specific to k8s.
>
> In the same way that k8s abstracts the underlying infrastructure (via
> its "cloud provider" concept), I think that Trove and similar projects
> need to use a similar abstraction and focus on providing a different API
> to their users that doesn't leak the underlying infrastructure API
> concepts out.

OK, so trying to summarise (stop me if I'm getting it wrong):
essentially you support option (2) because it is a closed abstraction.
Trove has its own quota management, billing, &c. and the user can't see
the VM, so the operator is free to substitute a different backend that
allocates compute capacity in finer-grained increments than Nova does.

Interestingly, that's only an issue because there is no finer-grained
compute resource than a VM available through the OpenStack API. If there
were an OpenStack API (or even just a Keystone-authenticated API) to a
shared, multitenant container orchestration cluster, this wouldn't be an
issue. But apart from OpenShift, I can't think of any cloud service
that's doing that - AWS, Google, OpenStack are all using the model where
the COE cluster is deployed on VMs that are owned by a particular
tenant. Of all the things you could run in containers on shared servers,
databases have arguably the most to lose (performance, security) and the
least to gain (since they're by definition stateful). So my question is:
if this is such a good idea for databases, why isn't anybody doing it
for everything container-based? i.e. instead of Magnum/Zun should we
just be working on a Keystone auth gateway for OpenShift (a.k.a. the
_one_ thing that _everyone_ had hitherto agreed was definitely out of
scope :D )?

Until then it seems to me that the tradeoff is between decoupling it
from the particular cloud it's running on so that users can optionally
deploy it standalone (essentially Vish's proposed solution for the *aaS
services from many moons ago) vs. decoupling it from OpenStack in
general so that the operator has more flexibility in how to deploy.

I'd love to be able to cover both - from a user using it standalone to
spin up and manage a DB in containers on a shared PaaS, through to a
user accessing it as a service to provide a DB running on a dedicated VM
or bare metal server, and everything in between. I don't know if such a
thing is feasible. I suspect we're going to have to talk a lot about VMs
and network plumbing and volume storage :)

cheers,
Zane.

> Best,
> -jay
>
>> Of course the current situation, as Amrith alluded to, where the
>> default is option (1) except without the lock-down feature in Nova,
>> though some operators are deploying option (2) but it's not tested
>> upstream... clearly that's the worst of all possible worlds, and AIUI
>> nobody disagrees with that.
>>
>> To my mind, (1) sounds more like "applications that run on OpenStack
>> (or other) infrastructure", since it doesn't require stuff like the
>> admin-only cross-project networking that makes it effectively "part of
>> the infrastructure itself" - as evidenced by the fact that
>> unprivileged users can run it standalone with little more than a
>> simple auth middleware change. But I suspect you are going to use
>> similar logic to argue for (2)? I'd be interested to hear your thoughts.
>>
>> cheers,
>> Zane.



Re: [trove][all][tc] A proposal to rearchitect Trove

Mark Kirkwood
In reply to this post by Jay Pipes
On 21/06/17 02:08, Jay Pipes wrote:

> On 06/20/2017 09:42 AM, Doug Hellmann wrote:
>> Does "service VM" need to be a first-class thing?  Akanda creates
>> them, using a service user. The VMs are tied to a "router" which
>> is the billable resource that the user understands and interacts with
>> through the API.
>
> Frankly, I believe all of these types of services should be built as
> applications that run on OpenStack (or other) infrastructure. In other
> words, they should not be part of the infrastructure itself.
>
> There's really no need for a user of a DBaaS to have access to the
> host or hosts the DB is running on. If the user really wanted that,
> they would just spin up a VM/baremetal server and install the thing
> themselves.
>

Yes, I think this area is where some hard thinking would be rewarded. I
recall when I first met Trove, in my mind I expected to be 'carving off
a piece of database'...and was a bit surprised to discover that it
(essentially) leveraged Nova VM + OS + DB (no criticism intended - just
saying I was surprised). Of course after delving into how it worked I
realized that it did make sense to make use of the various Nova things
(schedulers etc)....*but* now we are thinking about re-architecting
(plus more options exist now), it would make sense to revisit this area.

Best wishes

Mark


Re: [trove][all][tc] A proposal to rearchitect Trove

Thierry Carrez
In reply to this post by Zane Bitter
Zane Bitter wrote:

> [...]
> Until then it seems to me that the tradeoff is between decoupling it
> from the particular cloud it's running on so that users can optionally
> deploy it standalone (essentially Vish's proposed solution for the *aaS
> services from many moons ago) vs. decoupling it from OpenStack in
> general so that the operator has more flexibility in how to deploy.
>
> I'd love to be able to cover both - from a user using it standalone to
> spin up and manage a DB in containers on a shared PaaS, through to a
> user accessing it as a service to provide a DB running on a dedicated VM
> or bare metal server, and everything in between. I don't know if such a
> thing is feasible. I suspect we're going to have to talk a lot about VMs
> and network plumbing and volume storage :)

As another data point, we are seeing this very same tradeoff with Magnum
vs. Tessmaster (with "I want to get a Kubernetes cluster" rather than "I
want to get a database").

Tessmaster is the user-side tool from eBay for deploying Kubernetes on
different underlying cloud infrastructures: it takes a bunch of cloud
credentials, then deploys, grows and shrinks Kubernetes clusters for you.

Magnum is the infrastructure-side tool from OpenStack giving you
COE-as-a-service, through a provisioning API.

Jay is advocating for Trove to be more like Tessmaster, and less like
Magnum. I think I agree with Zane that those are two different approaches:

From a public cloud provider perspective serving lots of small users, I
think a provisioning API makes sense. The user in that case is in a
"black box" approach, so I think the resulting resources should not
really be accessible as VMs by the tenant, even if they end up being
Nova VMs. The provisioning API could propose several options (K8s or
Mesos, MySQL or PostgreSQL).

From a private cloud / hybrid cloud / large cloud user perspective, the
user-side deployment tool, letting you deploy the software on various
types of infrastructure, probably makes more sense. It's probably more
work to run it, but you gain in flexibility. That user-side tool would
probably not support multiple options, but be application-specific.

So yes, ideally we would cover both. Because they target different
users, and both are right...
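
To make the contrast between the two approaches concrete, here is a minimal Python sketch of the two shapes. It is only an illustration under stated assumptions: the class names `ProvisioningAPI` and `UserSideDeployer`, their methods, and the returned dictionaries are all hypothetical and do not reflect the actual Magnum or Tessmaster interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class ProvisioningAPI:
    """Operator-side service (Magnum-like): several engines, opaque resources.

    Hypothetical illustration, not a real OpenStack API.
    """
    engines: tuple = ("mysql", "postgresql")
    _backing: dict = field(default_factory=dict)  # operator-only bookkeeping

    def create(self, tenant: str, engine: str) -> dict:
        if engine not in self.engines:
            raise ValueError(f"unsupported engine: {engine}")
        resource_id = f"{engine}-{len(self._backing) + 1}"
        # The operator decides what backs the resource (for example a VM);
        # that detail is recorded here but never returned to the tenant.
        self._backing[resource_id] = {"tenant": tenant, "host": "operator-managed"}
        return {"id": resource_id, "engine": engine}


@dataclass
class UserSideDeployer:
    """User-side tool (Tessmaster-like): one application, many clouds.

    Hypothetical illustration; the user supplies their own cloud credentials.
    """
    application: str
    credentials: dict  # cloud name -> credential

    def deploy(self, cloud: str, nodes: int) -> dict:
        if cloud not in self.credentials:
            raise KeyError(f"no credentials for cloud: {cloud}")
        return {"application": self.application, "cloud": cloud, "nodes": nodes}

    def resize(self, deployment: dict, nodes: int) -> dict:
        # Grow or shrink by returning an updated description of the deployment.
        return {**deployment, "nodes": nodes}
```

In this sketch the provisioning API offers a choice of engines but hides the backing resource, while the user-side tool is tied to one application but can target any cloud the user has credentials for.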

--
Thierry Carrez (ttx)

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: [hidden email]?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [trove][all][tc] A proposal to rearchitect Trove

Davanum Srinivas
On Wed, Jun 21, 2017 at 1:52 AM, Thierry Carrez <[hidden email]> wrote:

> Zane Bitter wrote:
>> [...]
>> Until then it seems to me that the tradeoff is between decoupling it
>> from the particular cloud it's running on so that users can optionally
>> deploy it standalone (essentially Vish's proposed solution for the *aaS
>> services from many moons ago) vs. decoupling it from OpenStack in
>> general so that the operator has more flexibility in how to deploy.
>>
>> I'd love to be able to cover both - from a user using it standalone to
>> spin up and manage a DB in containers on a shared PaaS, through to a
>> user accessing it as a service to provide a DB running on a dedicated VM
>> or bare metal server, and everything in between. I don't know if such a
>> thing is feasible. I suspect we're going to have to talk a lot about VMs
>> and network plumbing and volume storage :)
>
> As another data point, we are seeing this very same tradeoff with Magnum
> vs. Tessmaster (with "I want to get a Kubernetes cluster" rather than "I
> want to get a database").
>
> Tessmaster is the user-side tool from eBay that deploys Kubernetes on
> different underlying cloud infrastructures: it takes a bunch of cloud
> credentials, then deploys, grows and shrinks Kubernetes clusters for you.
>
> Magnum is the infrastructure-side tool from OpenStack giving you
> COE-as-a-service, through a provisioning API.
>
> Jay is advocating for Trove to be more like Tessmaster, and less like
> Magnum. I think I agree with Zane that those are two different approaches:
>
> From a public cloud provider perspective serving lots of small users, I
> think a provisioning API makes sense. The user in that case takes a
> "black box" approach, so I think the resulting resources should not
> really be accessible as VMs by the tenant, even if they end up being
> Nova VMs. The provisioning API could offer several options (K8s or
> Mesos, MySQL or PostgreSQL).

I like this! ^^ If we can pull off "different underlying cloud
infrastructures" like Tessmaster, that would be of more value to folks
who may not be using OpenStack (or VMs!).


>
> From a private cloud / hybrid cloud / large cloud user perspective, the
> user-side deployment tool, letting you deploy the software on various
> types of infrastructure, probably makes more sense. It's probably more
> work to run it, but you gain in flexibility. That user-side tool would
> probably not support multiple options, but be application-specific.
>
> So yes, ideally we would cover both. Because they target different
> users, and both are right...
>
> --
> Thierry Carrez (ttx)



--
Davanum Srinivas :: https://twitter.com/dims


Re: [trove][all][tc] A proposal to rearchitect Trove

Zane Bitter
In reply to this post by Mark Kirkwood
On 21/06/17 01:49, Mark Kirkwood wrote:

> On 21/06/17 02:08, Jay Pipes wrote:
>
>> On 06/20/2017 09:42 AM, Doug Hellmann wrote:
>>> Does "service VM" need to be a first-class thing?  Akanda creates
>>> them, using a service user. The VMs are tied to a "router" which
>>> is the billable resource that the user understands and interacts with
>>> through the API.
>>
>> Frankly, I believe all of these types of services should be built as
>> applications that run on OpenStack (or other) infrastructure. In other
>> words, they should not be part of the infrastructure itself.
>>
>> There's really no need for a user of a DBaaS to have access to the
>> host or hosts the DB is running on. If the user really wanted that,
>> they would just spin up a VM/baremetal server and install the thing
>> themselves.
>>
>
> Yes, I think this area is where some hard thinking would be rewarded. I
> recall when I first met Trove, in my mind I expected to be 'carving off
> a piece of database'...and was a bit surprised to discover that it
> (essentially) leveraged Nova VM + OS + DB (no criticism intended - just
> saying I was surprised).

I think this is a common mistake (I know I've made it with respect to
other services) when hearing about a new *aaS thing and making
assumptions about the architecture. Here's a helpful way to think about it:

A cloud service has to have robust multitenancy. In the case of DBaaS,
that gives you two options. You can start with a database that is
already multitenant. If that works for your users, great. But many users
just want somebody else to manage $MY_FAVOURITE_DATABASE that is not
multitenant by design. Your only real option in that case is to give
them their own copy and isolate it somehow from everyone else's. This is
the use case that RDS and Trove are designed to solve.

It's important to note that this hasn't changed and isn't going to
change in the foreseeable future. What *has* changed is that there are
now more options for "isolate it somehow from everyone else's" - e.g.
you can use a container instead of a VM.
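
A minimal Python sketch of that second option, assuming hypothetical names throughout (`Isolation`, `DatabaseInstance`, `DedicatedProvisioner` are illustrations, not Trove or RDS code): every tenant gets its own copy of a single-tenant database, and the isolation mechanism is just a pluggable choice.

```python
from dataclasses import dataclass
from enum import Enum


class Isolation(Enum):
    """How a tenant's copy is kept apart from everyone else's."""
    VM = "vm"
    CONTAINER = "container"


@dataclass(frozen=True)
class DatabaseInstance:
    tenant: str
    engine: str
    isolation: Isolation


class DedicatedProvisioner:
    """Gives every tenant its own copy of a database that is not multitenant."""

    def __init__(self, isolation: Isolation):
        self.isolation = isolation
        self.instances = []

    def provision(self, tenant: str, engine: str) -> DatabaseInstance:
        # Never share an instance between tenants: each request produces
        # a fresh copy wrapped in the configured isolation mechanism.
        instance = DatabaseInstance(tenant, engine, self.isolation)
        self.instances.append(instance)
        return instance
```

Swapping `Isolation.VM` for `Isolation.CONTAINER` changes how the copy is isolated, but not the one-copy-per-tenant model itself.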

> Of course after delving into how it worked I
> realized that it did make sense to make use of the various Nova things
> (schedulers etc)....

Fun fact: Trove started out as a *complete fork* of Nova(!).

> *but* now we are thinking about re-architecting
> (plus more options exist now), it would make sense to revisit this area.


Re: [trove][all][tc] A proposal to rearchitect Trove

Fox, Kevin M
In reply to this post by Thierry Carrez
There are already user-side tools for deploying plumbing onto your own cloud, such as Tessmaster itself.

I think the win is being able to extend that k8s with the ability to declaratively request database clusters and manage them.

It's all about the commons.

If you build a Tessmaster clone just to do MariaDB, then you share nothing with the other communities and have to reinvent the wheel, yet again. The operator's load increases because the tool doesn't function like other tools.

If you rely on a container orchestration engine that is already cross-cloud and can easily be deployed by a user or a cloud operator, and fill in the gaps with what Trove wants to support (easy management of DBs), you get to reuse a lot of the commons. The user's slight increase in investment in dealing with the bit of extra plumbing allows other things to be added easily to their cluster as well. It's very rare that a user would need to deploy and manage only a database. The net load on the operator decreases, not increases.

Look at Helm apps for some examples. They do complex web applications that have web tiers, database tiers, etc., but they currently suffer from a lack of good support for clustered databases. In the end, the majority of users care about "helm install my_scalable_app" kinds of things rather than installing all the things by hand. It's a pain.

OpenStack itself has this issue. It has lots of API tiers and DB tiers. If Trove were a k8s operator, OpenStack on k8s could use it to deploy the rest of OpenStack. Even more sharing.
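
To illustrate what "declaratively request database clusters" could mean, here is a minimal Python sketch of the reconcile step at the heart of an operator-style controller. It is an assumption-laden illustration, not the Kubernetes API or any Trove code: `DatabaseClusterSpec`, `reconcile`, and the member naming scheme are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DatabaseClusterSpec:
    """Desired state declared by the user (hypothetical fields)."""
    name: str
    engine: str
    replicas: int


def reconcile(spec: DatabaseClusterSpec, running: list) -> list:
    """Return the actions needed to move the running members toward the spec.

    `running` is the ordered list of member names that currently exist.
    """
    actions = []
    # Too few members: create the missing ones.
    for i in range(len(running), spec.replicas):
        actions.append(("create", f"{spec.name}-{i}"))
    # Too many members: delete the surplus from the end.
    for member in running[spec.replicas:]:
        actions.append(("delete", member))
    return actions
```

The user only states the desired cluster; the controller repeatedly compares that with what exists and issues whatever create or delete actions close the gap. A real operator would also handle failures, upgrades, and backups, which this sketch leaves out.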

Thanks,
Kevin
