CGI.pm and lost carriage returns
Alan J. Flavell
*nix forums Guru


Joined: 05 Mar 2005
Posts: 311

Posted: Fri Jul 21, 2006 8:58 am    Post subject: Re: CGI.pm and lost carriage returns

On Thu, 20 Jul 2006, David Squire wrote:

Quote:
Joseph Czapski wrote:
Using

$value = $q->param($name);

gives me the text with all the carriage returns deleted. Some
words are just stuck together where they were separated by only
one or more carriage returns.

How and where are you displaying $value to make this judgment? In a
web browser? If so, note that HTML does not recognize carriage
returns - it uses <BR> (or <BR/> for XHTML :-) ) to indicate line
breaks.

Apart from being wrong in detail, as already pointed out, this seems
to me to be bizarrely wrong at the level of principles too. Even
though they are, strictly speaking, off-topic for this group, I feel
bound to make a comment.

The format of a submitted textarea is reasonably well specified in the
real HTML specification,
http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.7
(in conjunction with
http://www.w3.org/TR/REC-html40/interact/forms.html#h-17.13.3 ).

(This is not confuddled by proprietary "wrap=" attributes, which are
implemented in diverse and confusing ways. Obviously I've noted the
subsequent discussion about browsers inserting newlines for local
display purposes only, and not sending them as part of the submitted
data).

Anyhow, my point is that the submitted data (once the form submission
encoding layer has been unwrapped at the server side) is in principle
*plain text*.

Sure, that plain text *could* be HTML "source", or equally it could be
C++ source or a Perl script or... just plain *plain text*.
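
To make the "unwrapping" step concrete, here is a minimal sketch, assuming the form was submitted as application/x-www-form-urlencoded. The sub name decode_form_value and the sample request body are made up for illustration; CGI.pm's param() performs this decoding for you. The HTML 4.01 description of that content type says line breaks are sent as CR LF pairs (%0D%0A), so the line breaks do arrive at the server.

```perl
use strict;
use warnings;

# Decode one application/x-www-form-urlencoded value by hand.
# Hypothetical helper, shown only to illustrate the decoding step.
sub decode_form_value {
    my ($raw) = @_;
    $raw =~ tr/+/ /;                               # '+' encodes a space
    $raw =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # %XX encodes one octet
    return $raw;
}

# Hypothetical request body for a textarea named "text".
my $body = 'text=first+line%0D%0Asecond+line';
my ($name, $raw) = split /=/, $body, 2;
my $value = decode_form_value($raw);

# Make the line terminators visible before printing.
(my $visible = $value) =~ s/\015/<CR>/g;
$visible =~ s/\012/<LF>/g;
print "$name: $visible\n";
```

This prints "text: first line<CR><LF>second line": the decoded value is plain text with its line terminators intact.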

The idea of simply stuffing-in <br> tags wherever a newline is seen in
the source is quite bizarre to me. If you want to produce proper HTML
from what was meant to be plain text then you need a properly defined
procedure for doing so (you see such functionality in the
editing features of various Wikis, for example).
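
One possible "properly defined procedure" can be sketched as follows. The rule itself is an assumption chosen for illustration (blank lines separate paragraphs, single newlines inside a paragraph become <br>), and escape_html and text_to_html are hypothetical names, not part of CGI.pm or any other module. The point is only that the text is encoded first and the markup is added afterwards by a stated rule.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;    # must come first
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Assumed rule: blank lines separate paragraphs,
# single newlines inside a paragraph become <br>.
sub text_to_html {
    my ($text) = @_;
    $text =~ s/\015\012?/\n/g;        # CRLF or bare CR -> LF
    $text =~ s/\A\s+|\s+\z//g;        # trim leading/trailing whitespace
    my @paras = split /\n[ \t]*\n\s*/, $text;
    return join "\n", map {
        my $p = escape_html($_);
        $p =~ s/\n/<br>\n/g;
        "<p>$p</p>";
    } @paras;
}

print text_to_html("Hello <world>\r\nsecond line\r\n\r\nNew & improved"), "\n";
```

The sample call produces two <p> elements, with "<world>" and "&" appearing as entities rather than as markup.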

On the other hand if your users are expecting to be inputting HTML
"source code", you sure don't want to go inserting unsolicited tags.
You might very well want to analyze the input for potentially
compromising markup, though (scripting attacks and such).

Quote:
Still, it should at least treat them as white space...

What's "it" meant to be in this sentence? Have we even understood
what it is that the O.P is intending to achieve? Whatever it is, I'm
highly sceptical of the server-side processing merely sprinkling the
input with <br> tags instead of newlines, and nothing more: it does
not seem to be a solution to any variant of this problem that I can
think of. BICBW, of course.

regards

--

If the crash doesn't occur immediately, the [development] cycle is broken,
and the result is called a release. -- detha, in the monastery.
David Squire
*nix forums Guru Wannabe


Joined: 08 Apr 2006
Posts: 197

Posted: Fri Jul 21, 2006 9:32 am    Post subject: Re: CGI.pm and lost carriage returns

Alan J. Flavell wrote:

Quote:
I'm
highly sceptical of the server-side processing merely sprinkling the
input with <br> tags instead of newlines, and nothing more: it does
not seem to be a solution to any variant of this problem that I can
think of. BICBW, of course.

Hmmm. I see it so often that I would almost call it a FAQ. People ask
"where did my linebreaks go?" when displaying text in a browser. This is
due to not realizing that HTML does not use CR, LF etc. for this purpose.

A common situation where this might arise is a simple comment field
where the comment typed is to be displayed on an HTML page, and the
designer wants user newlines to be retained in formatting. Often <BR>
tags is all that is needed to get the desired effect... and indeed the
OP has already indicated that doing just that solved his problem.

DS
Alan J. Flavell
*nix forums Guru


Joined: 05 Mar 2005
Posts: 311

Posted: Fri Jul 21, 2006 10:10 am    Post subject: Re: CGI.pm and lost carriage returns

On Fri, 21 Jul 2006, David Squire wrote:

Quote:
Alan J. Flavell wrote:

I'm highly sceptical of the server-side processing merely sprinkling the
input with <br> tags instead of newlines, and nothing more: it does not seem
to be a solution to any variant of this problem that I can think of. BICBW,
of course.

Hmmm. I see it so often that I would almost call it a FAQ. People
ask "where did my linebreaks go?" when displaying text in a browser.
This is due to not realizing that HTML does not use CR, LF etc. for
this purpose.

But the input was *NOT* meant to be HTML in the first place, so
attempting to display it as such is completely illogical. If it's
plain text, then send it as text/plain. Even MSIE has finally caught
up with that concept.
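
A minimal sketch of the text/plain route, using only core Perl. The sub name plain_text_response and the utf-8 charset are assumptions for illustration; with CGI.pm the header would normally come from $q->header('text/plain') instead of being written by hand.

```perl
use strict;
use warnings;

# Build a complete CGI response that serves the submitted text as text/plain.
# The charset is an assumption; use whatever encoding the form page declares.
sub plain_text_response {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ s/\015\012?/\n/g;     # normalize CRLF / bare CR to LF
    return "Content-Type: text/plain; charset=utf-8\015\012\015\012" . $text;
}

print plain_text_response("line one\r\n<b>not markup here</b>\r\n");
```

Served this way, the browser shows the line breaks as typed and displays "<b>" literally, since nothing in a text/plain body is treated as markup.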

Quote:
A common situation where this might arise is a simple comment field
where the comment typed is to be displayed on an HTML page, and the
designer wants user newlines to be retained in formatting.

Yeah, and then the mischievous user inserts some naughty javascript,
or includes a link to some dangerous web page, and soon the damage is
done.

Quote:
Often <BR> tags is all that is needed

*Absolutely not*. Have you *no* sense of network security?

Quote:
to get the desired effect...

The "desired effect" is not half of what you're liable to get, if you
allow arbitrary web users to type their choice of HTML and you calmly
insert it into your web page.
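
The difference can be shown with a short sketch. Both subs and the sample comment are hypothetical; escape_html is a hand-rolled stand-in for a proper encoder such as HTML::Entities and only covers text content and double-quoted attribute values.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;    # must come first
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# Newlines to <br> with no encoding: user markup goes through untouched.
sub naive_br {
    my ($text) = @_;
    $text =~ s/\015\012?|\012/<br>/g;
    return $text;
}

# Encode first, then add the only markup we intend to emit.
sub safe_br {
    my ($text) = @_;
    $text = escape_html($text);
    $text =~ s/\015\012?|\012/<br>/g;
    return $text;
}

my $comment = "Nice page!\n<script>alert('gotcha')</script>";
print naive_br($comment), "\n";
print safe_br($comment),  "\n";
```

The first line printed still contains a live <script> element; the second contains "&lt;script&gt;", which a browser displays as text instead of executing.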

Quote:
and indeed the OP has already indicated that doing just that solved
his problem.

It might have "solved" what the O.P perceived to be the problem. After
all, the (in)famous Matt would have had no idea when he launched his
Script Archive just what kinds of network abuse he would be
responsible for.

--
David Squire
*nix forums Guru Wannabe


Joined: 08 Apr 2006
Posts: 197

Posted: Fri Jul 21, 2006 10:36 am    Post subject: Re: CGI.pm and lost carriage returns

Alan J. Flavell wrote:
Quote:
On Fri, 21 Jul 2006, David Squire wrote:

Alan J. Flavell wrote:

I'm highly sceptical of the server-side processing merely sprinkling the
input with <br> tags instead of newlines, and nothing more: it does not seem
to be a solution to any variant of this problem that I can think of. BICBW,
of course.
Hmmm. I see it so often that I would almost call it a FAQ. People
ask "where did my linebreaks go?" when displaying text in a browser.
This is due to not realizing that HTML does not use CR, LF etc. for
this purpose.

But the input was *NOT* meant to be HTML in the first place, so
attempting to display it as such is completely illogical.

I don't agree with this. You could see it as a terribly simple Wiki
code: only newlines are significant as extra mark-up. There are all
sorts of Wikis around now that take non-HTML mark-up entered as plain
text in forms and convert it to HTML.

Quote:
Often <BR> tags is all that is needed

*Absolutely not*. Have you *no* sense of network security?

Fair enough. Point taken. There would have to be other sanity checks too.


DS
A. Sinan Unur
*nix forums Guru


Joined: 03 Mar 2005
Posts: 1840

Posted: Fri Jul 21, 2006 11:41 am    Post subject: Re: CGI.pm and lost carriage returns

David Squire <David.Squire@no.spam.from.here.au> wrote in
news:e9qamj$heb$1@gemini.csx.cam.ac.uk:

Quote:
Alan J. Flavell wrote:
On Fri, 21 Jul 2006, David Squire wrote:

Alan J. Flavell wrote:

I'm highly sceptical of the server-side processing merely
sprinkling the input with <br> tags instead of newlines, and

....

Quote:
Hmmm. I see it so often that I would almost call it a FAQ. People
ask "where did my linebreaks go?" when displaying text in a browser.
This is due to not realizing that HTML does not use CR, LF etc. for
this purpose.

But the input was *NOT* meant to be HTML in the first place, so
attempting to display it as such is completely illogical.

I don't agree with this. You could see it as a terribly simple Wiki
code:

....

Quote:
Often <BR> tags is all that is needed

*Absolutely not*. Have you *no* sense of network security?

Fair enough. Point taken. There would have to be other sanity checks
too.

Sanity checks? Checking arbitrary text entered in a textbox is
'difficult' in a very real sense.

What is needed is not checking, but encoding.

Demo available at: http://www.unur.com/cgi-bin/echo.pl

#!/usr/bin/perl

use strict;
use warnings;

use CGI;
$CGI::POST_MAX = 1024;
$CGI::DISABLE_UPLOADS = 1;

use HTML::Entities qw(encode_entities_numeric);

run();

sub run {
    my $cgi = CGI->new;
    my $text = $cgi->param('text');
    if ( defined $text ) {
        $text =~ s/\015\012?|\012/\n/;
        $text = encode_entities_numeric $text;
        $text = join('<br>', split /\n/, $text);
    }
    print $cgi->header('text/html');
    print <<EO_HTML;
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Textarea Echo</title>
<style type="text/css">
body { margin: 50px 0px; padding:0px; text-align:center; }
#content { width:600px; margin:0px auto; text-align:left; }
textarea { border: 1px solid #99f; }
blockquote { font-style: italic; color: #191; }
input.button { border: 1px solid #99f; }
</style>
</head>
<body id="content">
<h1>Textarea Echo</h1>
<blockquote>$text</blockquote>
<div>
<p><label for="text">Enter some text below</label>:</p>
<form action="http://www.unur.com/cgi-bin/echo.pl" method="POST">
<textarea rows="10" cols="60" name="text" id="text"></textarea>
<div>
<input class="button" type="submit">
<input class="button" type="reset">
</div>
</form>
</div>
</body>
</html>
EO_HTML
}





--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
A. Sinan Unur
*nix forums Guru


Joined: 03 Mar 2005
Posts: 1840

Posted: Fri Jul 21, 2006 12:04 pm    Post subject: Re: CGI.pm and lost carriage returns

"A. Sinan Unur" <1usa@llenroc.ude.invalid> wrote in
news:Xns98074E545E88Basu1cornelledu@127.0.0.1:

Quote:
$text =~ s/\015\012?|\012/\n/;

$text =~ s/\015\012?|\012/\n/g;
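
The effect of the missing /g can be checked with a short sketch. The sample string and the helper count_cr are made up for illustration; the substitution is the one from the post.

```perl
use strict;
use warnings;

# Count the carriage returns left in a string.
sub count_cr {
    my ($s) = @_;
    my $n = () = $s =~ /\015/g;
    return $n;
}

# Hypothetical textarea value with CRLF line terminators.
my $raw = "one\015\012two\015\012three";

(my $first_only = $raw) =~ s/\015\012?|\012/\n/;     # no /g: first match only
(my $all        = $raw) =~ s/\015\012?|\012/\n/g;    # /g: every match

printf "without /g: %d CR left\n", count_cr($first_only);
printf "with /g:    %d CR left\n", count_cr($all);
```

Without /g only the first terminator is normalized, so one stray carriage return remains in the sample; with /g none do.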

Sinan
--
A. Sinan Unur <1usa@llenroc.ude.invalid>
(remove .invalid and reverse each component for email address)

comp.lang.perl.misc guidelines on the WWW:
http://augustmail.com/~tadmc/clpmisc/clpmisc_guidelines.html
Joseph Czapski
*nix forums beginner


Joined: 20 Jul 2006
Posts: 5

Posted: Fri Jul 21, 2006 1:28 pm    Post subject: Re: CGI.pm and lost carriage returns

I previously wrote:
Quote:
Yup. Setting WRAP to 'physical' solved the problem.
....


Sorry for the further confusion. Now I think that *eliminating* the WRAP
attribute entirely is the best thing to do. And my code snippet after
getting the $value back from CGI.pm is:

$value =~ s/(\S)\s*?\x0A\s*\x0A\s*?(\S)/$1<br><br>$2/g;
$value =~ s/(\S)\s*?\x0D\s*\x0D\s*?(\S)/$1<br><br>$2/g;
$value =~ s/\s*\x0A\s*/<br>/g;
$value =~ s/\s*\x0D\s*/<br>/g;
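
For comparison, a sketch in the same spirit that also takes up the encoding point raised earlier in the thread. This is not the snippet above rewritten exactly: breaks_to_br and escape_html are hypothetical names, the sub trims leading and trailing whitespace (which the four substitutions above do not), and it encodes the text before any <br> is added.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text.
sub escape_html {
    my ($s) = @_;
    $s =~ s/&/&amp;/g;    # must come first
    $s =~ s/</&lt;/g;
    $s =~ s/>/&gt;/g;
    $s =~ s/"/&quot;/g;
    return $s;
}

# One <br> per line break, at most two in a row for blank lines.
sub breaks_to_br {
    my ($value) = @_;
    $value =~ s/\x0D\x0A?/\x0A/g;          # CRLF or bare CR -> LF
    $value =~ s/\A\s+|\s+\z//g;            # no leading/trailing breaks
    $value = escape_html($value);          # encode before adding tags
    $value =~ s/[ \t]*\x0A(?:[ \t]*\x0A)+[ \t]*/<br><br>/g;   # blank line(s)
    $value =~ s/[ \t]*\x0A[ \t]*/<br>/g;   # single line break
    return $value;
}

print breaks_to_br("first line\r\nsecond line\r\n\r\n\r\nnew paragraph <b>\r\n"), "\n";
```

Normalizing the terminators first means one pair of substitutions covers CRLF, bare CR and bare LF input alike.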


Joe Czapski
Boston, Mass.