Alan J. Flavell *nix forums Guru
Joined: 05 Mar 2005
Posts: 311
|
Posted: Fri Jul 21, 2006 8:58 am Post subject:
Re: CGI.pm and lost carriage returns
|
|
|
On Thu, 20 Jul 2006, David Squire wrote:
| Quote: | Joseph Czapski wrote:
Using
$value = $q->param($name);
gives me the text with all the carriage returns deleted. Some
words are just stuck together where they were separated by only
one or more carriage returns.
How and where are you displaying $value to make this judgment? In a
web browser? If so, note that HTML does not recognize carriage
returns - it uses <BR> (or <BR/> for XHTML ) to indicate line
breaks.
|
Apart from being wrong in detail, as already pointed out, this seems
to me to be bizarrely wrong at the level of principles too. Even
though they are, strictly speaking, off-topic for this group, I feel
bound to make a comment.
The format of a submitted textarea is reasonably well specified in the
real HTML specification,
http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.7
(in conjunction with
http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.13.3 ).
(This is not confuddled by proprietary "wrap=" attributes, which are
implemented in diverse and confusing ways. Obviously I've noted the
subsequent discussion about browsers inserting newlines for local
display purposes only, and not sending them as part of the submitted
data).
Anyhow, my point is that the submitted data (once the form submission
encoding layer has been unwrapped at the server side) is in principle
*plain text*.
Sure, that plain text *could* be HTML "source", or equally it could be
C++ source or a Perl script or... just plain *plain text*.
The idea of simply stuffing-in <br> tags wherever a newline is seen in
the source is quite bizarre to me. If you want to produce proper HTML
from what was meant to be plain text then you need a properly defined
procedure for doing so (you see such functionality in the
editing features of various Wikis, for example).
On the other hand if your users are expecting to be inputting HTML
"source code", you sure don't want to go inserting unsolicited tags.
You might very well want to analyze the input for potentially
compromising markup, though (scripting attacks and such).
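To make the plain-text case concrete, here is a minimal sketch of one possible "defined procedure": normalize the line endings, escape everything that is special in HTML, and only then add markup of your own. The helper names escape_html and text_to_html are hypothetical and use core Perl only; a real application would more likely rely on a maintained module such as HTML::Entities.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the five characters that are special in HTML.
sub escape_html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#39;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# Hypothetical helper: plain text in, HTML fragment out.
sub text_to_html {
    my ($text) = @_;
    $text =~ s/\015\012?|\012/\n/g;   # CRLF, bare CR or bare LF -> "\n"
    $text = escape_html($text);       # escape before adding any markup
    $text =~ s/\n/<br>\n/g;           # only now introduce tags of our own
    return $text;
}

print text_to_html("1 < 2\r\n<script>alert(1)</script>"), "\n";
```

The order matters: escaping after the <br> tags have been inserted would escape those tags as well, and not escaping at all leaves whatever markup the user typed live in the page.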
| Quote: | Still, it should at least treat them as white space...
|
What's "it" meant to be in this sentence? Have we even understood
what it is that the O.P is intending to achieve? Whatever it is, I'm
highly sceptical of the server-side processing merely sprinkling the
input with <br> tags instead of newlines, and nothing more: it does
not seem to be a solution to any variant of this problem that I can
think of. BICBW, of course.
regards
--
If the crash doesn't occur immediately, the [development] cycle is broken,
and the result is called a release. -- detha, in the monastery. |
David Squire *nix forums Guru Wannabe
Joined: 08 Apr 2006
Posts: 197
|
Posted: Fri Jul 21, 2006 9:32 am Post subject:
Re: CGI.pm and lost carriage returns
|
|
|
Alan J. Flavell wrote:
| Quote: | I'm
highly sceptical of the server-side processing merely sprinkling the
input with <br> tags instead of newlines, and nothing more: it does
not seem to be a solution to any variant of this problem that I can
think of. BICBW, of course.
|
Hmmm. I see it so often that I would almost call it a FAQ. People ask
"where did my linebreaks go?" when displaying text in a browser. This is
due to not realizing that HTML does not use CR, LF etc. for this purpose.
A common situation where this might arise is a simple comment field
where the comment typed is to be displayed on an HTML page, and the
designer wants user newlines to be retained in formatting. Often <BR>
tags is all that is needed to get the desired effect... and indeed the
OP has already indicated that doing just that solved his problem.
DS |
Alan J. Flavell *nix forums Guru
Joined: 05 Mar 2005
Posts: 311
|
Posted: Fri Jul 21, 2006 10:10 am Post subject:
Re: CGI.pm and lost carriage returns
|
|
|
On Fri, 21 Jul 2006, David Squire wrote:
| Quote: | Alan J. Flavell wrote:
I'm highly sceptical of the server-side processing merely sprinkling the
input with <br> tags instead of newlines, and nothing more: it does not seem
to be a solution to any variant of this problem that I can think of. BICBW,
of course.
Hmmm. I see it so often that I would almost call it a FAQ. People
ask "where did my linebreaks go?" when displaying text in a browser.
This is due to not realizing that HTML does not use CR, LF etc. for
this purpose.
|
But the input was *NOT* meant to be HTML in the first place, so
attempting to display it as such is completely illogical. If it's
plain text, then send it as text/plain. Even MSIE has finally caught
up with that concept.
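A minimal sketch of the text/plain route, assuming a plain CGI script that writes its own response header. The helper name plain_text_response is hypothetical, and the charset shown is an assumption about how the submitted data is encoded.

```perl
use strict;
use warnings;

# Hypothetical helper: build a complete CGI response that serves the
# submitted text verbatim as text/plain, so no HTML parsing is involved.
# The charset is an assumption; use whatever the form data really is.
sub plain_text_response {
    my ($text) = @_;
    return "Content-Type: text/plain; charset=UTF-8\r\n\r\n" . $text;
}

print plain_text_response("line one\nline two <b>not bold</b>\n");
```

Served this way, the user's line breaks are shown as typed and any tags in the input are just characters on the screen.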
| Quote: | A common situation where this might arise is a simple comment field
where the comment typed is to be displayed on an HTML page, and the
designer wants user newlines to be retained in formatting.
|
Yeah, and then the mischievous user inserts some naughty javascript,
or includes a link to some dangerous web page, and soon the damage is
done.
| Quote: | Often <BR> tags is all that is needed
|
*Absolutely not*. Have you *no* sense of network security?
| Quote: | to get the desired effect...
|
The "desired effect" is not half of what you're liable to get, if you
allow arbitrary web users to type their choice of HTML and you calmly
insert it into your web page.
| Quote: | and indeed the OP has already indicated that doing just that solved
his problem.
|
It might have "solved" what the O.P perceived to be the problem. After
all, the (in)famous Matt would have had no idea when he launched his
Script Archive just what kinds of network abuse he would be
responsible for.
-- |
David Squire *nix forums Guru Wannabe
Joined: 08 Apr 2006
Posts: 197
|
Posted: Fri Jul 21, 2006 10:36 am Post subject:
Re: CGI.pm and lost carriage returns
|
|
|
Alan J. Flavell wrote:
| Quote: | On Fri, 21 Jul 2006, David Squire wrote:
Alan J. Flavell wrote:
I'm highly sceptical of the server-side processing merely sprinkling the
input with <br> tags instead of newlines, and nothing more: it does not seem
to be a solution to any variant of this problem that I can think of. BICBW,
of course.
Hmmm. I see it so often that I would almost call it a FAQ. People
ask "where did my linebreaks go?" when displaying text in a browser.
This is due to not realizing that HTML does not use CR, LF etc. for
this purpose.
But the input was *NOT* meant to be HTML in the first place, so
attempting to display it as such is completely illogical.
|
I don't agree with this. You could see it as a terribly simple Wiki
code: only newlines are significant as extra mark-up. There are all
sorts of Wikis around now that take non-HTML mark-up entered as plain
text in forms and convert it to HTML.
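As a rough illustration of such a "newline-only Wiki" (a sketch under my own assumptions, not anyone's actual code from this thread): blank lines separate paragraphs, single newlines become <br>, and everything else is escaped. The helper names escape_html and newline_wiki_to_html are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: minimal escaping of HTML-special characters.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Hypothetical "newline-only wiki": blank lines separate paragraphs,
# single newlines inside a paragraph become <br>.
sub newline_wiki_to_html {
    my ($text) = @_;
    $text =~ s/\015\012?|\012/\n/g;    # normalize line endings
    $text =~ s/\A\s+|\s+\z//g;         # trim leading/trailing whitespace
    my @paragraphs = split /\n[ \t]*\n\s*/, $text;
    return join "\n", map {
        my $p = escape_html($_);
        $p =~ s/\n/<br>\n/g;
        "<p>$p</p>";
    } @paragraphs;
}

print newline_wiki_to_html("first line\r\nsecond line\r\n\r\nnew <paragraph>"), "\n";
```

The escaping step is what keeps this from being "merely sprinkling <br> tags": the only markup in the output is the markup the converter itself produced.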
| Quote: | Often <BR> tags is all that is needed
*Absolutely not*. Have you *no* sense of network security?
|
Fair enough. Point taken. There would have to be other sanity checks too.
DS |
A. Sinan Unur *nix forums Guru
Joined: 03 Mar 2005
Posts: 1840
|
Posted: Fri Jul 21, 2006 11:41 am Post subject:
Re: CGI.pm and lost carriage returns
|
|
|
David Squire <David.Squire@no.spam.from.here.au> wrote in
news:e9qamj$heb$1@gemini.csx.cam.ac.uk:
| Quote: | Alan J. Flavell wrote:
On Fri, 21 Jul 2006, David Squire wrote:
Alan J. Flavell wrote:
I'm highly sceptical of the server-side processing merely
sprinkling the input with <br> tags instead of newlines, and
|
....
| Quote: | Hmmm. I see it so often that I would almost call it a FAQ. People
ask "where did my linebreaks go?" when displaying text in a browser.
This is due to not realizing that HTML does not use CR, LF etc. for
this purpose.
But the input was *NOT* meant to be HTML in the first place, so
attempting to display it as such is completely illogical.
I don't agree with this. You could see it as a terribly simple Wiki
code:
|
....
| Quote: | Often <BR> tags is all that is needed
*Absolutely not*. Have you *no* sense of network security?
Fair enough. Point taken. There would have to be other sanity checks
too.
|
Sanity checks? Checking arbitrary text entered in a textbox is
'difficult' in a very real sense.
What is needed is not checking, but encoding.
Demo available at: http://www.unur.com/cgi-bin/echo.pl
#!/usr/bin/perl
use strict;
use warnings;

use CGI;
use HTML::Entities qw(encode_entities_numeric);

# Limit the request size and refuse file uploads.
# (CGI.pm's variable is $CGI::POST_MAX, not $CGI::POSTMAX.)
$CGI::POST_MAX        = 1024;
$CGI::DISABLE_UPLOADS = 1;

run();

sub run {
    my $cgi  = CGI->new;
    my $text = $cgi->param('text');

    if ( defined $text ) {
        # Normalize CRLF, bare CR and bare LF to "\n"
        # (/g added, per the correction in the follow-up post).
        $text =~ s/\015\012?|\012/\n/g;
        # Encode first, so user-supplied markup is displayed, not interpreted.
        $text = encode_entities_numeric($text);
        # Only then turn line breaks into <br>.
        $text = join '<br>', split /\n/, $text;
    }
    else {
        $text = '';    # first visit: nothing to echo yet
    }

    print $cgi->header('text/html');
    print <<EO_HTML;
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Textarea Echo</title>
<style type="text/css">
body { margin: 50px 0px; padding:0px; text-align:center; }
#content { width:600px; margin:0px auto; text-align:left; }
textarea { border: 1px solid #99f; }
blockquote { font-style: italic; color: #191; }
input.button { border: 1px solid #99f; }
</style>
</head>
<body id="content">
<h1>Textarea Echo</h1>
<blockquote>$text</blockquote>
<div>
<p><label for="text">Enter some text below</label>:</p>
<form action="http://www.unur.com/cgi-bin/echo.pl" method="POST">
<textarea rows="10" cols="60" name="text" id="text"></textarea>
<div>
<input class="button" type="submit">
<input class="button" type="reset">
</div>
</form>
</div>
</body>
</html>
EO_HTML
}
--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)
comp.lang.perl.misc guidelines on the WWW:
http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html |
A. Sinan Unur *nix forums Guru
Joined: 03 Mar 2005
Posts: 1840
|
Posted: Fri Jul 21, 2006 12:04 pm Post subject:
Re: CGI.pm and lost carriage returns
|
|
|
"A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in
news:Xns98074E545E88Basu1cornelledu@127.0.0.1:
| Quote: | $text =~ s/\015\012?|\012/\n/;
|
$text =~ s/\015\012?|\012/\n/g;
Sinan
Joseph Czapski *nix forums beginner
Joined: 20 Jul 2006
Posts: 5
|
Posted: Fri Jul 21, 2006 1:28 pm Post subject:
Re: CGI.pm and lost carriage returns
|
|
|
I previously wrote:
| Quote: | Yup. Setting WRAP to 'physical' solved the problem.
.... |
Sorry for the further confusion. Now I think that *eliminating* the WRAP
attribute entirely is the best thing to do. And my code snippet after
getting the $value back from CGI.pm is:
$value =~ s/(\S)\s*?\x0A\s*\x0A\s*?(\S)/$1<br><br>$2/g;
$value =~ s/(\S)\s*?\x0D\s*\x0D\s*?(\S)/$1<br><br>$2/g;
$value =~ s/\s*\x0A\s*/<br>/g;
$value =~ s/\s*\x0D\s*/<br>/g;
Joe Czapski
Boston, Mass. |
The time now is Sat Nov 22, 2008 8:28 pm | All times are GMT