Robert Watson *nix forums Guru Wannabe
Joined: 22 Mar 2002
Posts: 218
Posted: Sun Jun 11, 2006 1:25 pm Post subject: MFC of socket/protocol reference improvements
Dear All,
I'm in the process of evaluating a possible MFC of the socket/protocol
reference changes I made in April, 2006, to the RELENG_6 branch. Over the
past few months, these changes have been gradually refined, and a number of
bugs (of varying severity) have been fixed. These changes are important
because they close a significant number of races, reduce the locking overhead
and improve parallelism, and lay the groundwork for future improvement in the
socket and protocol code. However, they also make significant changes in a
number of important network protocols (such as TCP), and change the semantics
of the socket/protocol interface (protosw.h). I think these changes are
important to our short and long term goals of improving network stack
performance and architecture.
My original plan was to start looking in detail at the MFC after about three
months of settling time, which is in about three weeks. I continue to plan to
do this. A few specific points for discussion:
(1) Normally, RELENG_* has significant constraints on changes to the kernel
APIs used by loadable modules -- especially for device drivers. In the
past, we've not made a lot of changes to the protocol switch interface,
and historically it hasn't been a run-time extensible interface. Andre
has recently made changes to allow IP protocols to be loaded at runtime,
such as IP divert, and these will be affected, however. Do we consider
modules programmed against these interfaces to be "breakable" -- i.e., they
require a recompile and/or changes in the RELENG_6 branch?
(2) More testing would really be appreciated. I caught a subtle bug in the
handling of the retransmit timer in the context of my changes by accident
as a result of close analysis of TCP traces through a firewall -- I
noticed some "odd" packets that were, with a bit of time, tracked to their
source. However, this sort of thing is really subtle. Any help
determining whether there are other regressions in TCP behavior would be
greatly appreciated. While we've now hammered on the new code quite a
bit, and it fixes some known panics in RELENG_6, I would categorize these
changes as high risk, as they touch quite sensitive and heavily deployed
code. Getting the 7.x code tested in diverse high load environments
before the MFC would be very good.
I'm still in the process of looking at further refinements of the
socket/protocol relationship, which may be candidates for future merging, and
will depend on these changes. Among other things, I've been looking at
further evolving the notion of socket close vs. socket detach, which are
currently conflated notions, leading to both a lack of clarity and lack of
flexibility in the current API. In turn, that has presented a problem with
experimenting with alternative locking strategies, such as vertical
integration of locks between the socket and protocol layers. Getting these
changes into RELENG_6 will depend on these earlier changes being merged.
Robert N M Watson
Computer Laboratory
University of Cambridge
_______________________________________________
freebsd-arch@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
Doug White *nix forums beginner
Joined: 10 Aug 2002
Posts: 26
Posted: Fri Jun 16, 2006 1:01 am Post subject: Re: MFC of socket/protocol reference improvements
On Sun, 11 Jun 2006, Robert Watson wrote:
| Quote: | (1) Normally, RELENG_* has significant constraints on changes to the kernel
APIs used by loadable modules -- especially for device drivers. In the
past, we've not made a lot of changes to the protocol switch interface,
and historically it hasn't been a run-time extensible interface. Andre
has recently made changes to allow IP protocols to be loaded at runtime,
such as IP divert, and these will be affected, however. Do we consider
modules programmed against these interfaces to be "breakable" -- i.e.,
they require a recompile and/or changes in the RELENG_6 branch?
|
From a policy standpoint, breaking modules in a -STABLE branch is
forbidden since it causes pain for 3rd party developers. Exceptions can be
made for changes that provide more benefit than damage caused by breaking
the ABI.
Usually the question revolves around, "Does anyone actually distribute
modules that use that interface?" :-)
--
Doug White | FreeBSD: The Power to Serve
dwhite@gumbysoft.com | www.FreeBSD.org
Robert Watson *nix forums Guru Wannabe
Joined: 22 Mar 2002
Posts: 218
Posted: Fri Jun 16, 2006 9:56 am Post subject: Re: MFC of socket/protocol reference improvements
On Thu, 15 Jun 2006, Doug White wrote:
| Quote: | On Sun, 11 Jun 2006, Robert Watson wrote:
(1) Normally, RELENG_* has significant constraints on changes to the kernel
APIs used by loadable modules -- especially for device drivers. In the
past, we've not made a lot of changes to the protocol switch interface,
and historically it hasn't been a run-time extensible interface. Andre
has recently made changes to allow IP protocols to be loaded at runtime,
such as IP divert, and these will be affected, however. Do we consider
modules programmed against these interfaces to be "breakable" -- i.e.,
they require a recompile and/or changes in the RELENG_6 branch?
From a policy standpoint, breaking modules in a -STABLE branch is forbidden
since it causes pain for 3rd party developers. Exceptions can be made for
changes that provide more benefit than damage caused by breaking the ABI.
Usually the question revolves around, "Does anyone actually distribute
modules that use that interface?"
|
So far I've identified one third party InfiniBand stack used in a product that
implements the protosw API, and would need changes. However, the scope and
nature of the product mean that this wouldn't be a significant issue for them.
I guess the question is whether and how many other protocol modules exist out
there. My leaning is to say very few, but perhaps those people simply don't
talk to me/us. It would be good to know if there are any other significant
protocol stacks being distributed in binary form.
The source changes are relatively minor to update a protocol, but do need to
be done. I have some more changes in the works that follow on the heels of
this change that require a slightly larger set of protocol changes -- all in
the management of socket setup and teardown (attach, abort, close, detach).
The larger overhaul portion of my changes is within protocols, and those
changes largely rely on the protocol switch changes.
These changes do a couple of things:
(1) Close a number of known races and eliminate a number of known panics. For
example, a panic in tcp_ctloutput() has been repeatedly reported, and is
fixed by this because so_pcb can no longer suddenly become NULL during a
call to setsockopt().
(2) Reduce lock contention on the tcbinfo lock by avoiding acquiring it in the
socket send path in a significant number of situations -- specifically,
normal, off-the-shelf send and receive on TCP.
(3) Lay the groundwork for future changes to break down tcbinfo and otherwise
optimize TCP locking.
We can do (1) through some workarounds I've been looking at in a few cases,
and I believe specifically in the setsockopt() case. (2) and (3) are
basically impossible without these changes, so if we want to get them into
RELENG_6 as opposed to just RELENG_7, merging this set of changes is
necessary.
I hope to post an initial draft patch in a few days -- I'm currently setting
up a Perforce branch and merging the changes to that branch, but there are
about 65 commits involved.
Robert N M Watson
Computer Laboratory
University of Cambridge
Doug Barton *nix forums addict
Joined: 24 Apr 2002
Posts: 91
Posted: Fri Jun 16, 2006 6:42 pm Post subject: Re: MFC of socket/protocol reference improvements
I'm opposed to this kind of change on principle. The issue of subtle but
incompatible changes to the API within a major release branch has come up
before as one of the reasons that vendors find it difficult/uninteresting to
support FreeBSD. With our new shorter release cycles, I think we need to
draw a line in the sand and declare unambiguously that the APIs will NOT
change within a release, and then live (and learn) with the consequences.
If these (or any other) changes become so compelling that we feel the
userbase needs to have them ASAP, I would suggest that this would be
justification to move up the release cycle for 7.x, not to break faith in 6.x.
Doug
--
This .signature sanitized for your protection
Robert Watson *nix forums Guru Wannabe
Joined: 22 Mar 2002
Posts: 218
Posted: Fri Jun 16, 2006 6:53 pm Post subject: Re: MFC of socket/protocol reference improvements
On Fri, 16 Jun 2006, Doug Barton wrote:
| Quote: | I'm opposed to this kind of change on principle. The issue of subtle but
incompatible changes to the API within a major release branch has come up
before as one of the reasons that vendors find it difficult/uninteresting to
support FreeBSD. With our new shorter release cycles, I think we need to
draw a line in the sand and declare unambiguously that the APIs will NOT
change within a release, and then live (and learn) with the consequences.
If these (or any other) changes become so compelling that we feel the
userbase needs to have them ASAP, I would suggest that this would be
justification to move up the release cycle for 7.x, not to break faith in
6.x.
|
Understand that this is a KPI -- kernel programming interface, and not an API.
No applications will be affected by this change, since the sockets API and
protocol management APIs for applications are untouched. In general, we don't
maintain KPI compatibility in a branch except for some specific kernel
interfaces, such as device driver interfaces, which are untouched by the
proposed changes. In fact, the third party InfiniBand stack and the SCTP
stack (which will be merged soon) are the first cases I know of where
significant third party code bases program against this kernel interface on
FreeBSD, other (historically) than early KAME development. And both the SCTP
stack and KAME stack required changes to the same KPI changed by these
patches, so really the InfiniBand stack is the only instance we
know of, and it is used by a vendor that already extensively modifies the kernel
in ways that will be affected by even very minor changes within other parts of
the kernel.
So I'm not just looking for objection on principle, I'm looking for objection
based on practice: do we know of third parties extending the kernel with this
KPI who distribute their work and will be affected by this in ways that make
it difficult for them to maintain their component? Remember that if they
compile their module against the updated kernel, they will get warnings
indicating the KPI changes have taken place, since the prototypes of the
affected protosw entries will change.
The question is really whether we want to rule out any further TCP and socket
structural improvements for RELENG_6 based on kernel modules that only
hypothetically exist or not. If we can show they exist, then that's another
issue, but it requires organizations to have written entire protocol stacks
from the ground up, which is (one presumes) relatively rare, and to a KPI that
has in the past changed frequently (not the protosw interfaces, but certainly
other aspects of socket behavior that are immediately relevant).
Also as an FYI: this does not affect consumers of sockets in the kernel, such
as distributed file systems, only implementors of protocols themselves, such
as TCP/IP, AppleTalk, IPX/SPX, etc.
Robert N M Watson
Computer Laboratory
University of Cambridge
Doug Barton *nix forums addict
Joined: 24 Apr 2002
Posts: 91
Posted: Fri Jun 16, 2006 7:18 pm Post subject: Re: MFC of socket/protocol reference improvements
Robert Watson wrote:
| Quote: | So I'm not just looking for objection on principle, I'm looking for
objection based on practice: do we know of third parties extending the
kernel with this KPI who distribute their work and will be affected by
this in ways that make it difficult for them to maintain their
component? Remember that if they compile their module against the
updated kernel, they will get warnings indicating the KPI changes have
taken place, since the prototypes of the affected protosw entries will
change.
|
First off, it's entirely possible that my knowledge of the programming
issues involved is not sufficient to make an intelligent judgment on this
topic. If that's the case here, I apologize for wasting everyone's time.
That said however, I reject your hypothesis that a third party developer who
is depending on the status quo has to make themselves known before you're
willing to reconsider changing things in RELENG_6. There are any number of
reasons why this might be impossible or undesirable, not the least of which
is that no one from that vendor is subscribed to -arch.
On a more fundamental level, what I'm asking for is a clear bright line, and
what you're saying is that it's ok for the line to be fuzzy, where some
things can always be changed because they don't affect anyone we know about,
other things can sometimes be changed if the pain is minimal, etc. I fully
concede that from a developer standpoint, you might be right, and you're in
a much better position than I to make that call. However from a business
standpoint (and I am forced to wear my businessman hat more than my
developer hat nowadays, c'est la vie), clear bright lines are good, and
fuzzy lines that depend on the (perceived) whims of people I don't know and
don't have any authority over are bad.
Doug
--
This .signature sanitized for your protection
Robert Watson *nix forums Guru Wannabe
Joined: 22 Mar 2002
Posts: 218
Posted: Fri Jun 16, 2006 7:34 pm Post subject: Re: MFC of socket/protocol reference improvements
On Fri, 16 Jun 2006, Doug Barton wrote:
| Quote: | That said however, I reject your hypothesis that a third party developer who
is depending on the status quo has to make themselves known before you're
willing to reconsider changing things in RELENG_6. There are any number of
reasons why this might be impossible or undesirable, not the least of which
is that no one from that vendor is subscribed to -arch.
On a more fundamental level, what I'm asking for is a clear bright line, and
what you're saying is that it's ok for the line to be fuzzy, where some
things can always be changed because they don't affect anyone we know about,
other things can sometimes be changed if the pain is minimal, etc. I fully
concede that from a developer standpoint, you might be right, and you're in
a much better position than I to make that call. However from a business
standpoint (and I am forced to wear my businessman hat more than my
developer hat nowadays, c'est la vie), clear bright lines are good, and
fuzzy lines that depend on the (perceived) whims of people I don't know and
don't have any authority over are bad.
|
I'm also interested in a line, but what I'm trying to determine is where the
line falls: if we have a line around the set of "supported KPIs", a line we've
never really drawn very well in the past, is the protosw KPI on one side of
the line, or the other? The status quo is that the line is fuzzy: in the
past, we've changed related KPIs with some frequency, although I wouldn't call
it wild abandon. We can't say that no interface in the kernel is ever allowed
to change, or what you'd get is a release rather than a branch, with almost no
movement at all in the kernel. Instead, we have to pick certain interfaces we
choose to keep more static in order to support third party developers where it
makes the most sense.
In the past, this has almost always meant device driver vendors, although file
systems and netgraph modules have generally been treated fairly well. It's
made sense for two reasons: first, that it's actually possible and desirable
to maintain the staticness of the KPIs, in part because we have large numbers
of our own internal consumers and changing them all is a pain, and in part
because third parties actually have existed who ship products against them
(such as video drivers, ethernet drivers, etc). But what about protosw? Do
these apply?
There are strong technical motivations to not support the protosw interface as
a static KPI: this will allow us to continue to mature our network SMP
implementation, close races, and add new features, such as SCTP (which relies
on expanding the protosw KPI, which will break the ABI, FYI). The question is
whether there are strong technical or organizational motivations *not* to
break it, such as an awareness that this is a KPI that third party developers
actually ever program to and expect to remain static.
Robert N M Watson
Computer Laboratory
University of Cambridge
gnn@freebsd.org *nix forums beginner
Joined: 11 Aug 2005
Posts: 15
Posted: Sat Jun 17, 2006 4:02 am Post subject: Re: MFC of socket/protocol reference improvements
At Fri, 16 Jun 2006 20:34:49 +0100 (BST),
rwatson wrote:
| Quote: | I'm also interested in a line, but what I'm trying to determine is
where the line falls: if we have a line around the set of "supported
KPIs", a line we've never really drawn very well in the past, is the
protosw KPI on one side of the line, or the other? The status quo
is that the line is fuzzy: in the past, we've changed related KPIs
with some frequency, although I wouldn't call it wild abandon. We
can't say that no interface in the kernel is ever allowed to change,
or what you'd get is a release rather than a branch, with almost no
movement at all in the kernel. Instead, we have to pick certain
interfaces we choose to keep more static in order to support third
party developers where it makes the most sense.
In the past, this has almost always meant device driver vendors,
although file systems and netgraph modules have generally been
treated fairly well. It's made sense for two reasons: first, that
it's actually possible and desirable to maintain the staticness of
the KPIs, in part because we have large numbers of our own internal
consumers and changing them all is a pain, and in part because third
parties actually have existed who ship products against them (such
as video drivers, ethernet drivers, etc). But what about protosw?
Do these apply?
There are strong technical motivations to not support the protosw
interface as a static KPI: this will allow us to continue to mature
our network SMP implementation, close races, and add new features,
such as SCTP (which relies on expanding the protosw KPI, which will
break the ABI, FYI). The question is whether there are strong
technical or organizational motivations *not* to break it, such as
an awareness that this is a KPI that third party developers actually
ever program to and expect to remain static.
|
My thoughts on these changes are that, unlike device drivers, there
aren't hundreds of protocols that would be affected by a change in the
protosw. Certainly there are more than existed in the past, but it is
very likely less than 10. I think we need to also consider the fact
that these changes, which a lot of us are using/testing in HEAD, do
improve the performance and stability of the protocol used most often
in FreeBSD, that is TCP. I think that for those two reasons we should
do this but ONLY at the end of the current release cycle so as to give
people on 6 the maximum amount of time to deal with this issue.
Best,
George